Journal of 
Experimental Psychology 


ARTHUR W. MELTON, Editor 
Air Force Personnel and Training Research Center
Lackland Air Force Base
San Antonio, Texas


CONSULTING EDITORS 
Lloyd G. Humphreys, AF Personnel and Training Center
Arthur L. Irion, Tulane University 


Donald B. Lindsley, University of California, 
Los Angeles 


Neal E. Miller, Yale University 

Kenneth W. Spence, State University of Iowa 
Benton J. Underwood, Northwestern University 
Delos D. Wickens, Ohio State University 


Lorraine Bouthilet, Managing Editor 





CONTENTS 


Reminiscence and Forgetting in Motor Learning After Extended Rest Intervals:
J. C. Jahnke and C. P. Duncan 273
The Effect of Verbal Reinforcement Combinations on Conceptual Learning:
A. H. Buss and E. H. Buss 283
Acquisition and Extinction With Different Verbal Reinforcement Combinations:
A. H. Buss, W. Braden, A. Orgel, and E. H. Buss 288
The Relation of Anxiety (Drive) Level to Performance in Competitional and
Noncompetitional Paired-Associates Learning:
K. W. Spence, I. E. Farber, and H. H. McFann 296
Anxiety (Drive) Level and Degree of Competition in Paired-Associates Learning:
K. W. Spence, J. Taylor, and R. Ketchel 306
Learning and Extinction Based Upon Frustration, Food Reward, and Exploratory
Tendency: H. M. Adelman and J. L. Maatsch
Concept Identification as a Function of Task Complexity and Distribution of Practice:
F. G. Brown and E. J. Archer 316
Inverted-Alphabet Printing as a Function of Intertrial Rest and Sex:
E. J. Archer and L. E. Bourne, Jr. 322
Reduction of Error With Practice in Perception of the Postural Vertical: C. M. Solley 329
Listening to Overlapping Calls: E. C. Poulton
Surprise as a Factor in the Von Restorff Effect: R. T. Green





American Psychological Association 
Vol. 52 No. 5 November 1956 







© 1956 by the American Psychological Association, Inc.










Journal of 





Experimental Psychology 
Vol. 52, No. 5 November, 1956








REMINISCENCE AND FORGETTING IN MOTOR LEARNING 
AFTER EXTENDED REST INTERVALS¹


JOHN C. JAHNKE² AND CARL P. DUNCAN


Northwestern University 


In motor learning the occurrence of a relative depression of performance
under massed practice (MP), the partial recovery from this depression
over rest, and the fact that the recovery sometimes does not reach the
level of a distributed practice (DP) group which has been given the same
amount of prerest practice, have all been accounted for (3, 11, 19) by
means of two inhibitory processes, most commonly of the type postulated
by Hull (9). One of these processes, reactive inhibition (I_R), is
conceived to be an unlearned negative drive state, built up by repeated
responding, which serves to depress performance during MP. Also, I_R is
held to dissipate spontaneously to zero within relatively short (10-20
min.) rests (3, 13) and thus performance may increase over rest
(reminiscence). Conditioned inhibition (sI_R), the second process, is
conceived to be a learned response which is motivated by the presence
of I_R. It presumably depresses performance relatively permanently after
rest, i.e., it explains incomplete recovery from the performance
depression produced by MP.

Since sI_R is assumed to be a process which persists after I_R has
dissipated, it is important, for measuring sI_R, to know how long is the
period required for all I_R to dissipate. All investigators have supposed
that no further I_R remains after 10-20 min. of rest, basing their
assumption on such studies as (3, 6, 7, 8, 10, 11, 13). But a pilot
experiment which led to the present study suggested that traces of I_R
may remain after intervals as long as a day or even a week. So the first
purpose of the present experiment is to test for dissipation of I_R over
considerably longer periods than have previously been employed.

Another assumption made to support sI_R is that the same amount of habit
strength is built up under all degrees of practice distribution. The
assumption has been stated explicitly by at least one writer (12) and
has been accepted by most workers. Nevertheless, the assumption has been
strongly criticized (18) and some (e.g., 4, 16, 19) have attempted to
equate degree of learning for MP and DP groups by various means. Since
forgetting is a measure of habit strength, one way to make at least a
partial approach to the problem of MP and DP habit strengths is to
examine the course of forgetting, particularly over intervals so long
that there is likely to be little or no confounding of forgetting by even
residual traces of I_R. Thus, the second purpose of the present study is
to measure forgetting of MP and DP groups over intervals appreciably
longer than a week, since, as noted, it may turn out that some I_R is
present up to a week. With such data in hand, there may be better grounds
for evaluating the assumption of equal habit strengths on which sI_R
depends.

¹ This paper is based partly on the first author's Ph.D. thesis at
Northwestern University and partly on work done in connection with Grant
B-792 to the second author from the U. S. Public Health Service.

² Now at Indiana University, Jeffersonville Center.


METHOD


Subjects.—The Ss were 440 male students, 
drawn mostly from introductory psychology 
classes at Northwestern University. All were 
naive to the rotary-pursuit task used. 

Apparatus.—The pursuit rotor had a turntable of 14.3 cm. radius and was
driven by a 60-rpm motor. A brass target disc .87 cm. in radius was set
flush with the surface of the turntable. The center of the target was
7.8 cm. from the center of the turntable. The S held a hinged brass
stylus so he could follow the target without being able to press down on
it. When the stylus was in contact with the target, a circuit was closed
to the .01-sec. Standard Electric clocks, which recorded time on target.
During the cycles of 10-sec. work and 20-sec. rest, which were used in
one of the DP conditions, the rotor ran automatically each trial for
about 12 sec., and the clocks alternated in recording the 10-sec. middle
portions of these periods. Each 12-sec. operating period was then
followed by a rest period of approximately 18 sec. in which the rotor
was stopped. During the 5-25 work-rest cycles, used in another of the DP
conditions, the rotor ran about 7 sec. on each trial, and the clocks
recorded the 5-sec. middle portions of these periods. The operating
period was followed by a period of about 23 sec. in which the rotor was
stopped. In the MP conditions the rotor ran continuously, with the
clocks alternating in recording every 30 sec. The scores were recorded
to the nearest .01 sec.
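The figures report performance as per cent time on target, whereas the clocks recorded seconds on target. A minimal sketch of that conversion follows, assuming each score is simply divided by the length of the clock period named above (10 sec., 5 sec., or 30 sec.). The function name and the dictionary of periods are illustrative and do not come from the original report.

```python
# Hypothetical helper: converts a clock reading to per cent time on target.
# The scored periods are taken from the Apparatus description; treating the
# 30-sec. MP clock period as the scoring unit is an assumption.
SCORED_SECONDS = {"10-20": 10.0, "5-25": 5.0, "MP": 30.0}


def percent_time_on_target(seconds_on_target: float, scored_seconds: float) -> float:
    """Return time on target as a percentage of the scored period."""
    if scored_seconds <= 0:
        raise ValueError("scored_seconds must be positive")
    if not 0 <= seconds_on_target <= scored_seconds:
        raise ValueError("seconds_on_target must lie within the scored period")
    return 100.0 * seconds_on_target / scored_seconds


# Illustrative reading of 2.50 sec. on target in a 10-sec. scored period.
example = percent_time_on_target(2.50, SCORED_SECONDS["10-20"])
```

The range check reflects the fact that a clock cannot accumulate more time than the period it was recording.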

Procedure.—Twenty-two groups of 20 Ss each were formed as follows. On
starting the experiment, each S received five trials with the 10-20
cycle. The total time on target obtained in these trials was used to
match groups, S for S, for all 22 groups. A running total was kept for
each S during the matching trials, and S was assigned to one of the
experimental conditions during the 20-sec. rest following the last
matching trial.

A set of six DP (10-20 cycle) groups and a set of six MP (continuous
practice) groups then immediately received 6 min. of prerest practice.
One group from each set of six was then given a rest of either 10 min.,
1 day, or 1, 2, 3, or 4 wk. Upon returning after rest, all Ss were given
21 postrest trials with the 10-20 cycle. The same work-rest cycle was
used for all groups on postrest trials to permit comparison of postrest
performance without the confounding effects of differential I_R
developing under different work-rest cycles.

The 10-20 cycle for DP groups, continuous practice for MP groups, and
several minutes of prerest practice, are all typical procedures with the
pursuit rotor. But pilot work indicated that, with 6 min. of prerest
practice, some I_R accumulates even with the 10-20 cycle. It further
appeared that, with equal amounts of prerest practice, the forgetting
curve of the MP Ss might fall considerably below that of the 10-20 Ss
even over the long (1-4 wk.) intervals. It was therefore decided to make
a better attempt to approximate the forgetting curve of the MP groups
over the 1-4 wk. rests by running groups that would satisfy two
conditions: (a) prerest practice with a work-rest cycle of such highly
distributed practice that there would be no accumulation of I_R to
distort the forgetting curve, and (b) an amount of prerest practice that
would lead to a level of performance after 1-wk. rest that closely
approximated the level of performance of the MP Ss after 1-wk. rest.
Comparison of these highly distributed groups and the MP groups in terms
of the course of forgetting over 1-4 wk. should provide further
information on the problem of how much habit strength is developed by
the MP Ss during prerest practice.

An earlier study (17) had suggested that no I_R accumulates with a 5-25
work-rest cycle, so this was the cycle used during prerest practice for
the highly distributed groups. Preliminary work indicated that if such
groups were given somewhere between 20 and 30 trials of prerest
practice, their performance level after a week's rest would approximate
that of the MP Ss after a week. So two sets of 5-25 groups were run, one
set given 20 trials (1⅔ min.), the other set given 30 trials (2½ min.)
of prerest practice. Six groups, one for each rest interval, were
trained on the 5-25 20-trial condition, and four groups, one for each of
the four longest rests, were trained on the 5-25 30-trial condition.
Since it was expected that the 5-25 conditions would be most useful in
studying forgetting from the 1-wk. point on, only one of these
conditions (the 20-trial) was provided with groups tested after the two
shortest rests. The 5-25 Ss were, like all the other Ss, given both the
initial period of 5 matching trials with the 10-20 cycle, and 21
postrest trials with the 10-20 cycle.

Thus, for testing after the six lengths of rest, there are six groups
given 6 min. of 10-20 cycle DP during the prerest session, six groups
given 6 min. of continuous prerest practice (MP), and six groups given
20 trials (1⅔ min.) of prerest practice with the 5-25 cycle. For testing
after the four longest (1-4 wk.) rests, there are four groups given 30
trials (2½ min.) of prerest practice with the 5-25 cycle.
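The design just summarized can be tallied in a few lines as a check on the group count, the number of Ss, and the practice durations. This is only an illustrative sketch. It assumes that "minutes of practice" counts work time alone (trials multiplied by seconds of work per trial), and the dictionary layout and variable names are not from the original report.

```python
from fractions import Fraction

REST_INTERVALS = ["10 min.", "1 day", "1 wk.", "2 wk.", "3 wk.", "4 wk."]
SS_PER_GROUP = 20

# Prerest practice in minutes of work time (assumption: rest time excluded).
CONDITIONS = {
    "MP": {"prerest_min": Fraction(6), "rests": REST_INTERVALS},
    "10-20 DP": {"prerest_min": Fraction(6), "rests": REST_INTERVALS},
    # 20 trials x 5 sec. of work = 100 sec.
    "5-25 20-trial": {"prerest_min": Fraction(20 * 5, 60), "rests": REST_INTERVALS},
    # 30 trials x 5 sec. of work = 150 sec.; tested at the four longest rests only.
    "5-25 30-trial": {"prerest_min": Fraction(30 * 5, 60), "rests": REST_INTERVALS[2:]},
}

n_groups = sum(len(cond["rests"]) for cond in CONDITIONS.values())
n_subjects = n_groups * SS_PER_GROUP

# Postrest practice: 21 trials x 10 sec. of work with the 10-20 cycle.
postrest_min = Fraction(21 * 10, 60)

for name, cond in CONDITIONS.items():
    print(name, "prerest minutes:", cond["prerest_min"], "groups:", len(cond["rests"]))
print("groups:", n_groups, "Ss:", n_subjects, "postrest minutes:", postrest_min)
```

Under these assumptions the tally agrees with the 22 groups and 440 Ss stated in the text.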

All Ss were tested individually and were 
instructed concerning the task and procedure 
before starting the experiment. Complete 
instructions are not given here, but Ss were told 
that they would receive short periods of work 
followed by longer periods of rest, that they 
might receive a change of instructions during the 
course of practice, that they should use a smooth 
rotating motion of the arm (demonstrated) in 
following the target, and that they should at all 
times try as hard as they could to stay on the 
target. 

In conditions having 10-min. rests, Ss remained in the experimental room
and conversed with E, but not about the experiment. The Ss in longer
rest conditions went about their normal activities at the end of the
prerest practice session.

Before starting the postrest trials, all Ss were 
briefly reinstructed in the task. In these in- 
structions one point was emphasized for those 
Ss who had had prerest practice either with 
continuous practice or the 5-25 cycle. This was 
that the rotor would now start and stop as it 
had for them in the very beginning (i.e., 10-20 
cycle on matching trials). 


RESULTS 


The prerest practice curves for the four learning conditions (MP, 10-20,
5-25 20-trial, and 5-25 30-trial) are presented in Fig. 1. The initial
short segment of curve (through 50 sec. of practice) in Fig. 1 shows
performance of all Ss on the matching trials. Beyond this there are
learning curves (smoothed by inspection) for (a) the 5-25 conditions,
(b) the 10-20 condition, and (c) the MP condition. Since all retention
groups in a particular learning condition received identical treatment
prior to rest, points for the 10-20, 5-25 20-trial, and MP conditions
are means of scores of 120 Ss, while points for the 5-25 30-trial
condition are based on 80 Ss. The superiority in performance (but not
necessarily in habit strength) of all DP conditions over the MP
condition during prerest practice is clear.

Fig. 1. Prerest performance curves for all conditions. (Ordinate: mean
per cent time on target; abscissa: minutes of practice.)

Fig. 2. Initial postrest performance for all groups. (Ordinate: per cent
time on target; abscissa: length of rest.)

Recall.—The first purpose of this study was to determine how much
recovery from the prerest performance depression would occur in the MP
Ss when long enough rests were given to permit all I_R to dissipate.
Since it is customary to measure recovery from I_R (reminiscence) in
terms of performance on the first postrest trial, these points are
plotted for all groups in Fig. 2.

With reference to the MP and 10-20 DP curves in Fig. 2, it can be seen
that both groups reminisced above their final prerest level after 10
min. of rest. The reminiscence gain in the DP group yielded a
related-measures t of 3.44 (19 df); for the MP group t was 5.14. From 10
min. to 1 day, and to some extent from 1 day to 1 wk., the curves are
quite different (whereas beyond 1 wk. the curves have much the same
slope). The DP groups show a loss (which might be either forgetting or
warm-up decrement); the related-measures F for the comparison of the
10-min., 1-day, and 1-wk. points is 10.48 (P < .01, df = 2 and 38). In
contrast, the MP groups show no loss (in fact, a slight gain) from 10
min. to 1 wk.; the three retention points are not significantly
different (F < 1, df = 2 and 38). Thus, since the DP groups showed a
drop in performance from 10 min. to 1 wk., while the MP groups did not,
there are grounds for suggesting that some process, such as further
dissipation of I_R, counteracted overt drops in performance in the MP
groups.
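For readers unfamiliar with the related-measures t reported above (19 df for 20 paired scores), the computation can be sketched as follows. The scores below are hypothetical and serve only to show the arithmetic; none of them come from the study, and the function name is illustrative.

```python
import math
from statistics import mean, stdev


def related_measures_t(first, second):
    """Return (t, df) for paired scores, testing the mean of second - first."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    diffs = [b - a for a, b in zip(first, second)]
    # Standard error of the mean difference (raises ZeroDivisionError below
    # if every difference is identical, since the t ratio is then undefined).
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical per cent time on target: last prerest trial vs. first postrest trial.
last_prerest = [10.0, 12.0, 9.0, 11.0]
first_postrest = [13.0, 14.0, 12.0, 15.0]
t_value, df = related_measures_t(last_prerest, first_postrest)
```

With 20 Ss per group, as in the present experiment, the same function would return 19 df.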

Figure 2 suggests that the MP groups do not perform as well as the 10-20
groups over the 1-4 wk. intervals. Using the analysis of variance
procedure described by Lindquist (14, p. 145, 239 ff.), the F between
groups is significant beyond the .01 level. The results of this analysis
may be seen in Table 1. Thus, even after all or nearly all I_R has
(apparently) dissipated, comparison in terms of the first postrest trial
indicates either that the MP groups have not developed as much original
habit strength as the 10-20 groups, or have developed sI_R.

It will be recalled that in case the MP forgetting curve did fall
considerably below the 10-20 curve over the 1-4 wk. intervals, the two
5-25 conditions were run as a possible further aid in evaluating the MP
group's performance from 1 to 4 wk. Returning to Fig. 2, it should be
noted that, contrary to expectation, those 5-25 Ss that were given only
20 prerest trials showed reminiscence gain over 10 min. This gain is
significant at the .01 level (t = 5.61, 19 df). However, it is primarily
performance of the 5-25 groups over the 1-4 wk. intervals that is of
interest here.

TABLE 1

ANALYSIS OF VARIANCE OF INITIAL POSTREST MP AND 10-20 DP PERFORMANCE
OVER THE 1-4 WK. RETENTION INTERVALS

Source of Variation              df      MS        F
Rest Interval (I)
Distribution of Practice (D)      1    17.44    13.85*
Levels (L)                       19     9.39     7.46*
I x D                             3      .24
I x L                            57     1.45     1.15
D x L                            19     1.54     1.23
I x D x L

* P < .01.
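As a reading aid for Table 1, each legible F can be divided into its mean square to recover the error mean square that was used as the denominator. The sketch below assumes that every F in the table was formed against a single error term. That is an inference from the printed ratios, not a statement made in the text, and the short row labels are illustrative.

```python
# (MS, F) pairs as printed in Table 1 for the rows whose values are legible.
TABLE_1 = {
    "D": (17.44, 13.85),
    "L": (9.39, 7.46),
    "I x L": (1.45, 1.15),
    "D x L": (1.54, 1.23),
}

# If F = MS / MS_error, then MS_error = MS / F for every row.
implied_error_ms = {source: ms / f for source, (ms, f) in TABLE_1.items()}

for source, value in implied_error_ms.items():
    print(f"{source}: implied error MS = {value:.2f}")
```

The four ratios agree to within rounding, at a little over 1.25, which is consistent with one common error term.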

Figure 2 indicates that performance of the 5-25 groups, in terms of
first postrest trial over 1-4 wk., is somewhat higher than performance
of the MP Ss over the same intervals. The F between the 20-trial groups
and the MP groups over the four long intervals was 3.78, not significant
with 1 and 120 df. The F between the 30-trial 5-25 groups and the MP
groups was 9.50, P < .01, 1 and 120 df. (The last two F values reported
were obtained by combining the scores of the 1-4 wk. groups within each
of the three practice conditions and then testing the differences
between selected conditions by simple analysis of variance. Combining
groups in this manner was justified since within each condition the
variances were homogeneous, and the 1-4 wk. means did not yield a
significant F for any of the three practice conditions. These Fs also
show that there was no significant forgetting from 1 to 4 wk. in any of
the three conditions.)

Fig. 3. Postrest performance curves for each practice condition. The 1-4
wk. retention groups have been combined within each condition.
(Ordinate: per cent time on target; abscissa: trials.)

Fig. 4. Final postrest performance for all groups. (Ordinate: per cent
time on target; abscissa: length of rest.)

Relearning.—Do the results found 
with the first postrest trial as the 
measure hold up during the remainder 
of postrest practice? The answer to 
this may be seen in Fig. 3, where the 
1-4 wk. retention subgroups have 
been combined within each practice 
condition and the resulting means 
plotted for each postrest trial. Figure 
3 shows that the first postrest trial 
would be a poor estimate of the habit 
strength of the MP condition. On 
the first trial, performance of the MP 
Ss is the lowest of all conditions, but 
in later postrest practice these Ss 
perform better than those in either of 
the 5-25 conditions. 

In view of this finding, postrest performance for the four conditions
over time is shown in Fig. 4 in terms of means of the last 10 postrest
trials (relearning measure). Figure 4 shows that there were considerable
gains from the final prerest level to the points indicating postrest
performance after 10-min. rest, but these gains were, of course, due to
a combination of reminiscence and early postrest practice. However,
there were further gains, which could only be due to the additional
rest, from 10 min. to 1 day and, to a lesser degree, from 1 day to 1 wk.
These gains are shown in Table 2, where it may be seen that the 10-min.
to 1-day gain is significant in both the MP and the 5-25 conditions, but
not in the 10-20 condition. Thus it is clear that, in terms of this
relearning measure, postrest performance may not only be higher, but may
be significantly higher, with rest intervals up to 1 day in length.

TABLE 2

GAIN IN FINAL POSTREST PERFORMANCE, IN TERMS OF PER CENT TIME ON TARGET,
OVER THE REST INTERVALS INDICATED

             10 Min. to 1 Day Rest      1 Day to 1 Wk. Rest
Group        Mean Gain        t         Mean Gain        t
MP             7.98         3.01**        2.03          .61
5-25           6.58         2.37*         1.57          .59
10-20          2.35          .85           .20          .07

* P < .05, 19 df.
** P < .01, 19 df.

None of the 1-day to 1-wk. gains shown in Table 2 is significant
statistically, but the fact that there is a numerical gain for all
groups rather than a loss over this period suggests that recovery from
residual traces of I_R may continue over rests as long as several days.
In fact, just when the effects of accumulated I_R have completely
disappeared cannot be determined; the slight overt drops in performance
after 1 wk., shown in Fig. 4, might still be the net result of
forgetting and recovery from I_R.

Figure 4 shows that, from the 1-wk. point on, the 10-20 groups perform
better than the MP groups. The between-condition differences are highly
significant (F = 11.03, df = 1 and 120, P < .01). (This difference is
somewhat overestimated because of the large difference, which appears to
be attributable to sampling error, between the MP and 10-20 groups at
the 3-wk. point.) The comparison of the MP and 5-25 groups suggests
that, contrary to the inferences from the recall data, the MP groups
have learned more than either of the 5-25 groups. A test of the
difference between the MP and 5-25 30-trial groups over the 1-4 wk.
intervals indicates that the MP group is superior at the .05 level
(F = 4.79, df = 1 and 120). (Both the preceding F values were computed
after combining, as before, the scores for the four retention subgroups
within each practice condition. Combining groups in this manner was
justified since analyses of variance performed separately on the 1-4 wk.
groups within each condition yielded nonsignificant F values, again
indicating no significant forgetting from 1-4 wk., and the assumption of
homogeneity of variance was met.)

Warm-up.—The presence of a large warm-up decrement in the MP condition
(see Fig. 3) is apparently the major reason for the discrepancy between
recall and relearning measures. Because more extensive warm-up data are
available here than have previously been published, these data are
presented in Fig. 5. The measure of warm-up, which is similar to that
used by Adams (1), is the simple gain in score from the first to the
fifth postrest trials. With these gains, a higher score indicates
greater warm-up, i.e., greater decrement due to the need to warm up.

Fig. 5. Warm-up as a function of practice condition and length of rest.
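The three postrest measures used in this report (recall, warm-up, and relearning) can be stated compactly as functions of an S's 21 postrest trial scores. The sketch below follows the definitions in the text: recall is the first postrest trial, warm-up is the gain from the first to the fifth postrest trial, and relearning is the mean of the last 10 postrest trials (Trials 12-21). The function names and the example scores are hypothetical.

```python
from statistics import mean

N_POSTREST_TRIALS = 21


def _check(postrest):
    if len(postrest) != N_POSTREST_TRIALS:
        raise ValueError("expected 21 postrest trial scores")


def recall(postrest):
    """Score on the first postrest trial."""
    _check(postrest)
    return postrest[0]


def reminiscence(last_prerest, postrest):
    """Gain from the last prerest trial to the first postrest trial."""
    return recall(postrest) - last_prerest


def warm_up(postrest):
    """Gain from the first to the fifth postrest trial."""
    _check(postrest)
    return postrest[4] - postrest[0]


def relearning(postrest):
    """Mean of postrest Trials 12-21 (the last 10 trials)."""
    _check(postrest)
    return mean(postrest[11:21])


# Hypothetical scores rising one point per trial from 20 per cent on Trial 1.
scores = [20 + i for i in range(N_POSTREST_TRIALS)]
summary = {
    "recall": recall(scores),
    "reminiscence": reminiscence(15, scores),
    "warm_up": warm_up(scores),
    "relearning": relearning(scores),
}
```

Keeping the measures separate in this way makes it plain that recall and warm-up both depend on the first postrest trial, whereas relearning does not.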

An analysis of variance was performed on the data of Fig. 5 for the
three practice conditions (MP, 10-20, and 5-25 20-trial) that were
tested at all six time intervals. Time interval was significant
(F = 3.23, P < .01, 5 and 190 df), and Fig. 5 indicates that in general
warm-up first increases and then decreases with increasingly longer
intervals. Condition of practice was also significant at the .01 level
(F = 8.77, 2 and 190 df). However, two variables, original degree of
learning and degree of massing of practice, are confounded in this test.
It can only be suggested, from inspection of all four curves in Fig. 5,
that warm-up may increase both with degree of learning and with massing
of practice.


DISCUSSION


The first purpose of the present study was to determine if I_R continued
to dissipate over longer intervals than a few minutes. This was found to
be true. Both the recall and the relearning measures suggested that
residual traces of the I_R accumulated after a few minutes of prerest
practice still remain after a rest as long as one day, perhaps longer.
The evidence was especially clear-cut in the relearning data (Fig. 4).
Here, a significant postrest gain was found from 10 min. to 1 day in two
out of three conditions, and further slight but not significant gains
were found in all three conditions with rests up to a week. It is not
known from this experiment whether the slight (and not significant)
performance losses over rests greater than one week indicate that
dissipation of I_R is complete in a week, or whether forgetting of the
rotor habit is then masking any further gains.

These results concerning dissipation of I_R permit two points to be made
concerning sI_R. First, since I_R may not be entirely dissipated within
the 10 min. or so of rest that is usually allotted for its complete
decay, part of what has previously been called sI_R may have been merely
undissipated residual I_R. Second, sI_R has been measured as the amount
by which the first postrest trial (sometimes corrected for warm-up) of
MP groups falls below that of DP groups. But the present data, and other
studies (2, 15), show that initial postrest performance of MP Ss may be
a very poor index of their over-all postrest performance; relearning
measures may indicate considerably less postrest depression than recall
measures.

The second purpose of the present study was to examine the forgetting
curves of MP and DP groups over long intervals in order to provide some
information bearing on the issue of whether the postrest depression of
performance in MP Ss is attributable to sI_R or to less initial habit
strength. To further aid this purpose, an attempt was made to carry the
5-25 groups in prerest practice to the point where their performance
would match that of the MP Ss after a week's rest. The reasoning was
that if the forgetting curve of the MP Ss matched that of the 5-25 Ss
all the way from 1-4 wk., there would be grounds for inferring that the
MP Ss had originally developed no more habit strength than the 5-25 Ss
(who clearly had not developed as much habit strength as the 10-20 Ss).

This aspect of the study was not completely successful. In the first
place, even the groups given only 20 trials with the 5-25 cycle
accumulated some I_R, so there was no group in which performance could
be used as an unconfounded measure of habit strength at both short and
long rest intervals. Secondly, the 5-25 groups were practiced to the
point where their performance would match that of the MP groups on the
first postrest trial. But this resulted in seriously underestimating the
MP groups' habit strength as measured by relearning.

In spite of these difficulties, the similar shapes of the curves of all
conditions over the 1-4 wk. intervals lead the writers to favor the view
that the clear-cut postrest performance depression of the MP Ss, as
compared to the 10-20 Ss, represents smaller initial habit strength,
rather than sI_R. The evidence most strongly favoring this view is the
rank order of all four relearning curves after Trial 9 in Fig. 3 and
over all time intervals in Fig. 4. Except at a few points, the rank
order of the groups is exactly what would be expected as the result of
different amounts of original habit strength (amounts which are known to
be different for the two 5-25 and the 10-20 conditions). It is true,
however, that an interpretation in terms of sI_R cannot be ruled out.

Turning now to the warm-up data, it was shown that warm-up decrement
changes significantly as a function of length of rest, and Fig. 5
suggests that warm-up first increases and then decreases with increasing
rest. This result is in partial accord with that of Bell (5), but
differs from that of Ammons (3), who found warm-up decrement to be only
an increasing function of length of rest. However, Ammons used rests up
to only 6 hr. in length.

Warm-up was also shown to be some complex function of degree of learning
and degree of massing of practice. Thus, the nature of warm-up is not
well understood. If, as has been suggested (3), warm-up represents "loss
of set," one should at least find it to be an increasing function of
length of rest. The present results, and perhaps those of Bell (5),
indicate that this is not the case.

One other result is worth noting: there was little or no forgetting of
the pursuit habit over any interval up to 4 wk. There was no forgetting
measured in terms of relearning. With recall as the measure, the 10-20
Ss showed a loss in performance up to a week's rest, and the 5-25 Ss
showed a loss up to a day's rest (Fig. 2). However, it is not clear that
even these losses should be called forgetting in view of the finding
that all groups showed increasing warm-up decrements up to intervals as
long as two weeks (Fig. 5).

Because of the confusing effects of warm-up at recall, the writers
believe that reliance should be placed mainly on the relearning data,
both for the issue of dissipation of I_R and for the issue of habit
strength versus sI_R in the MP groups. But then the question arises, is
the relearning measure affected by differential transfer from the
work-rest cycles used in prerest practice to the 10-20 cycle used in
postrest practice? Two points can be made against any important
influence of differential transfer. First, as noted earlier,
instructions immediately preceding postrest practice emphasized the fact
that the cycle to be used in postrest practice was one with which S was
already familiar, since his very first five trials had been with this
cycle. Second, Fig. 3, and the data of two other studies (2, 15), show
that in changing from one work-rest cycle to another, stabilization of
performance under the new cycle occurs within the first 10 trials or
less after the change. In fact, Reynolds and Adams (15) develop an
hypothesis, based on their own and other data, which is essentially the
opposite of differential transfer (and which would also dispose of
sI_R). The hypothesis is that performance under any work-rest cycle is,
except for an initial brief period of adjustment, a function only of
that cycle (and, of course, original number of trials or habit
strength), without regard to previous experience with any other cycle.
Thus, it is not believed that the present relearning measure (mean of
postrest Trials 12-21) is influenced by differential transfer.







SUMMARY 


The study had two purposes: first, to determine if reactive inhibition
(I_R) continues to dissipate over intervals longer than a few minutes in
groups given prerest massed practice (MP), and second, to examine the
course of forgetting of the pursuit habit over fairly extended intervals
to attempt to provide information bearing on the issue of conditioned
inhibition (sI_R) in MP Ss. Toward these ends, 22 groups of 20 men each
were practiced on a pursuit rotor. Two sets of six groups each received
6 min. of prerest practice followed by rests of either 10 min., 1 day,
or 1, 2, 3, or 4 wk. One of these sets received prerest practice in
cycles of 10 sec. work—20 sec. rest; the other set (MP) received
continuous prerest practice. Of the remaining 10 groups, six (one for
each rest interval) were given 1⅔ min., and four (one for each of the
four longest rest intervals) were given 2½ min. of prerest practice with
a 5 sec. work—25 sec. rest cycle. Postrest practice was 21 trials (3½
min.) with the 10-20 cycle for all groups.

The essential findings were:

1. Both initial and final postrest performance differences between
corresponding length-of-rest MP and 10-20 groups became less with
increased rest up to one day, and, to some extent, up to one week. These
results were taken to mean that I_R continues to dissipate for as long
as a day, and possibly longer. They also indicate that previous
measurements of sI_R may have included some residual undissipated I_R.

2. Although differences between MP and 10-20 groups became less with
increased rest, the MP Ss performed significantly below the 10-20 Ss
over all intervals up to 4 wk. It was suggested that this was due to
less original habit strength developed in the MP Ss, but it was pointed
out that an interpretation in terms of sI_R could not be ruled out.

3. Comparison of MP and 5-25 Ss over the various rest intervals gave
opposite results depending upon whether the measure used was initial or
final postrest trials. The initial postrest trial was apparently
confounded by warm-up decrement, which was most severe in MP groups.
Because of this, and because final postrest performance more nearly
corresponded to original degree of learning, it was suggested that final
postrest performance is a better measure of most postrest phenomena in
motor learning.

4. Warm-up decrement was shown to first increase and then decrease with
increasing rest. Warm-up was also affected in a complex fashion by both
original degree of learning and degree of massing of practice.

5. There was little or no evidence of forgetting of the pursuit habit
over any rest interval.

(Received October 17, 1955)





Journal of Experimental Psychology 
Vol. 52, No. 5, 1956 


THE EFFECT OF VERBAL REINFORCEMENT 
COMBINATIONS ON CONCEPTUAL 
LEARNING ¹


ARNOLD H. BUSS AND EDITH H. BUSS 


Carter Memorial Hospital, Indianapolis 


Two previous studies (1, 2) have
investigated the effect of verbal reinforcement
combinations on learning
and extinction. The specific reinforcement
combinations were: the E
says Right for a correct response,
Wrong for an incorrect response
(Right-Wrong); Nothing for a correct
response, Wrong for an incorrect response
(Nothing-Wrong); and Right
for a correct response, Nothing for an
incorrect response (Right-Nothing).
There were two major findings: (a)
Right-Nothing results in significantly
slower learning than either Right-Wrong
or Nothing-Wrong, and (b)
Right-Wrong and Nothing-Wrong
yield similar rates of learning.

These data cannot be explained by 
the commonly accepted verbal rein- 
forcement continuum, which assumes 
that Right is as strong a positive 
reinforcer as Wrong is a negative 
reinforcer. According to this con- 
tinuum Right-Wrong offers more posi- 
tive reinforcement of the correct 
response than Nothing-Wrong and 
more negative reinforcement of the 
incorrect response than Right-Nothing.
Since more differential reinforcement
leads to faster learning, the
Right-Wrong combination has a dis- 
tinct advantage over the other two 
combinations, and it should yield 
significantly faster learning. Right- 
Nothing offers more positive rein- 
forcement of the correct response than 
Nothing-Wrong, but this advantage is
cancelled by the greater negative reinforcement
of the incorrect response by
Nothing-Wrong; therefore the Right-Nothing
and Nothing-Wrong combinations
should lead to similar rates of
learning.

¹ The writers acknowledge the helpful criticism
of Jerry Wiggins.

Since these expectations were not 
borne out in the first study (1), a 
reformulation of the verbal reinforce- 
ment continuum was attempted (2). 
It was assumed that Nothing is a 
nonreinforcer and that Right is a much 
weaker positive reinforcer (approach- 
ing Nothing) than Wrong is a negative 
reinforcer. Thus Right-Wrong is 
similar to Nothing-Wrong, and both 
of these combinations offer more 
differential reinforcement than does 
Right-Nothing. In other words the
critical component is Wrong, and the 
combinations with Wrong (Right- 
Wrong and Nothing-Wrong) should 
yield faster learning than the com- 
bination without Wrong (Right-Noth- 
ing). These predictions coincide with 
the results reported previously (1, 2). 
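
The contrast between the two continua can be made concrete with a small numerical sketch. The scale values used here (+1, 0, -1 for the commonly accepted continuum; +0.2, 0, -1 for the reformulated one) are illustrative assumptions only, since the studies assign no numbers to Right, Nothing, and Wrong, and the function name is hypothetical. Differential reinforcement is taken as the value following a correct response minus the value following an incorrect response.

```python
# Illustrative scale values only; the papers give no numerical values.
TRADITIONAL = {"Right": 1.0, "Nothing": 0.0, "Wrong": -1.0}
REFORMULATED = {"Right": 0.2, "Nothing": 0.0, "Wrong": -1.0}  # Right assumed weak

# Each combination: (what E says after a correct response, after an incorrect one)
COMBINATIONS = {
    "Right-Wrong": ("Right", "Wrong"),
    "Nothing-Wrong": ("Nothing", "Wrong"),
    "Right-Nothing": ("Right", "Nothing"),
}


def differential_reinforcement(scale):
    """Value after a correct response minus value after an incorrect one."""
    return {
        name: scale[after_correct] - scale[after_incorrect]
        for name, (after_correct, after_incorrect) in COMBINATIONS.items()
    }


for label, scale in (("traditional", TRADITIONAL), ("reformulated", REFORMULATED)):
    print(label, differential_reinforcement(scale))
```

With these assumed values the traditional continuum puts Right-Wrong alone at the top and ties the other two combinations, whereas the reformulated continuum puts Right-Wrong and Nothing-Wrong close together and Right-Nothing well below both.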

The theoretical reinforcement con- 
tinuum was tested in a very limited 
experimental learning situation. The 
paradigm was that of stimulus gen- 
eralization: the dimension was height, 
and during acquisition only stimuli of 
a given height were presented.
Thereafter generalization was tested
without extinction (1) or with extinction
(2) on stimuli of varying heights.
Furthermore, S was limited to making
only one of two possible responses.
Obviously the proposed verbal reinforcement
continuum can have little
generality because of the severe limitations
of the experimental conditions
under which it was tested. There- 
fore, the present experiments attempt 
to apply the theoretical continuum to 
a conceptual learning situation in 
which S has more than two possible 
responses. The predictions for this 
new situation are essentially the same 
as for the previous experimental 
situations: (a) the Right-Wrong and 
Nothing-Wrong groups will learn new 
concepts significantly faster than the 
Right-Nothing group, and (b) the 
conceptual learning of the Right-Wrong
and Nothing-Wrong groups
will be approximately the same.


Experiment I 
Method 


Subjects.—The Ss were 45 neuropsychiatric 
patients of both sexes who were not receiving 
shock treatment, not mentally defective, and 
not diagnosed as having organic brain damage. 
Fifteen Ss were assigned to each of three groups 
in the order of their appearance. 

Material.—The Wisconsin Card Sorting Test
(4) was used. In this test there are stimulus
cards and response cards. The stimulus cards
(1 red triangle, 2 green stars, 3 yellow crosses,
and 4 blue circles) are placed in the upper set of
four divided compartments. The S sorts a deck
of 64 response cards into the lower set of compartments
under the stimulus cards on the basis
of the shape, the color, or the number of the
figures printed on each card.

Procedure.—The S was handed the deck of 
response cards and told to sort them into the 
bottom set of compartments according to where 
he thought they belonged. To all queries E
replied, “Just go ahead and sort them where 
you think they belong.” 

The first concept to be learned was shape. 
Placing a response card under a stimulus card 
having figures of the same shape was correct; 
all other sortings were incorrect. The criterion 
of learning was 10 consecutive correct trials. 

When S had learned the shape concept to the 
criterion, the concept was changed to color 
without S being told of this change. Again the 
criterion of learning was 10 consecutive correct 
trials. If S did not make 10 consecutive color 
sortings by the time the deck of 64 response
cards was exhausted, he was handed another
(identical) deck. If the color concept was not 
learned to the criterion by 100 trials, S was 
stopped. 

There were three groups of Ss, each with a 
different verbal reinforcement combination: 
Right for a correct response, Wrong for an in- 
correct response (Right-Wrong); Nothing for a 
correct response, Wrong for an incorrect response 
(Nothing-Wrong); and Right for a correct response,
Nothing for an incorrect response (Right-Nothing).
The same combination was employed
in the learning of both concepts. 


Results 


The response measure used is the 
number of trials to learn, excluding 
the 10 consecutive correct trials. The 
mean and median number of trials to 
learn the shape concept is zero for all 
three groups. Only three Right- 
Wrong Ss, one Nothing-Wrong S, and 
no Right-Nothing S required more 
than zero trials to learn the shape 
concept. Since 41 out of 45 Ss 
started sorting for shape immediately, 
we may conclude that these Ss were 
“set” for shape. Virtually all of the 
Right-Wrong Ss and all of the Right- 
Nothing Ss received only the Right 
component of their respective reinforcement
combinations; all but one
of the Nothing-Wrong Ss received
only the Nothing component of their 
reinforcement combination. Since 
most of the Ss did not receive both 
components of the reinforcement combinations,
the “learning” of the shape
concept cannot be used as a basis for 
testing our predictions. 
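
The response measure can be sketched as a short scoring routine. This is a minimal illustration under stated assumptions: the function name and the representation of a protocol as a list of booleans are hypothetical, and it assumes that the criterion run must be completed within the 100-trial limit for S to count as having learned.

```python
def trials_to_criterion(correct, run_length=10, ceiling=100):
    """Trials preceding the first run of `run_length` consecutive correct sorts.

    `correct` is a list of booleans, one per trial. If the criterion run is
    not completed within `ceiling` trials, the score is `ceiling`.
    """
    streak = 0
    for trial, ok in enumerate(correct[:ceiling], start=1):
        streak = streak + 1 if ok else 0
        if streak == run_length:
            return trial - run_length  # exclude the criterion trials themselves
    return ceiling


print(trials_to_criterion([True] * 10))                 # correct from the first card
print(trials_to_criterion([False] * 3 + [True] * 10))   # three trials before the run
print(trials_to_criterion([False] * 100))               # never reached the criterion
```

An S who sorts correctly from the first card receives a score of zero under this rule, which is the sense in which most Ss required zero trials to learn the shape concept.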

After making 10 consecutive correct 
shape responses, all Ss were required 
to shift to a color concept. The 
learning data for color are presented 
in Table 1. 

TABLE 1

Number of Trials to Shift from a Shape
to a Color Concept in Exp. I

Group
Right-Wrong
Nothing-Wrong
Right-Nothing

The Right-Nothing group differs from
the other two groups in that its mean
is much larger and its SD is much
smaller than those of the Right-Wrong
and Nothing-Wrong groups. The
smaller Right-Nothing SD may be
attributed to the ceiling of 100 trials.
Two Right-Wrong Ss, five Nothing-Wrong
Ss, and 14 Right-Nothing Ss
failed to learn the color concept to a
criterion of 10 consecutive correct
trials, and their score was 100 trials.
These frequencies yield a chi square
of 17.34 that is significant at the .001
level. Thus the groups differ significantly
in the number of Ss who
learned the color concept. 
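
The chi square for these frequencies can be checked with a few lines of arithmetic. The sketch below is illustrative: the function name is hypothetical, and the counts of Ss who learned (13, 10, and 1) are obtained by subtracting the failures from the 15 Ss in each group. The reported value of 17.34 is reproduced when 0.5 is subtracted from each absolute deviation before squaring; that a continuity correction of this kind was applied is an inference from the arithmetic, not something the text states.

```python
def chi_square(observed, correction=0.0):
    """Pearson chi square for a table of counts given as a list of rows.

    `correction` is subtracted from each |observed - expected| before
    squaring (0.5 gives a Yates-style continuity correction).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (abs(obs - expected) - correction) ** 2 / expected
    return total


# Rows: Right-Wrong, Nothing-Wrong, Right-Nothing; columns: failed, learned.
counts = [[2, 13], [5, 10], [14, 1]]
print(round(chi_square(counts), 2))                  # uncorrected: 20.89
print(round(chi_square(counts, correction=0.5), 2))  # with 0.5 correction: 17.34
```

Either value exceeds 13.82, the tabled .001 point for 2 df, so the conclusion does not depend on whether the correction is applied.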

These data do not meet the as- 
sumptions required to make the usual 
t tests; therefore the Cochran-Cox (3) 
test was used for both these data and 
the data of Exp. II. The three comparisons
were: Right-Wrong vs. Nothing-Wrong
t = .88, P > .05; Right-Wrong
vs. Right-Nothing t = 7.08,
P < .001; and Right-Nothing vs.
Nothing-Wrong t = 4.79, P < .001,
all for 14 df. Thus the Right-Wrong 
and Nothing-Wrong groups learned 
the color concept significantly faster 
than the Right-Nothing group. 
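
For readers unfamiliar with the Cochran-Cox procedure, the following sketch shows the usual form of the approximation for two independent groups with unequal variances. The function name is hypothetical, and the summary statistics in the example are invented for illustration; they are not the values of Table 1. The statistic is the difference between means divided by the square root of the sum of the two squared standard errors, and the critical value is the average of the tabled t values for n1 - 1 and n2 - 1 df, weighted by the squared standard errors. With equal group sizes the critical value reduces to the tabled t for n - 1 df, which is why 14 df apply to groups of 15.

```python
import math


def cochran_cox(mean1, var1, n1, mean2, var2, n2, t_crit1, t_crit2):
    """Return (t_prime, critical_t_prime) for two independent groups.

    var1 and var2 are sample variances; t_crit1 and t_crit2 are tabled
    critical t values for n1 - 1 and n2 - 1 df at the chosen level.
    """
    w1, w2 = var1 / n1, var2 / n2  # squared standard errors of the means
    t_prime = (mean1 - mean2) / math.sqrt(w1 + w2)
    critical = (w1 * t_crit1 + w2 * t_crit2) / (w1 + w2)
    return t_prime, critical


# Hypothetical summary statistics (not the values of Table 1), n = 15 per group.
T_05_DF14 = 2.145  # two-tailed .05 critical t for 14 df, from a standard t table
t_prime, critical = cochran_cox(90.0, 100.0, 15, 40.0, 1200.0, 15,
                                T_05_DF14, T_05_DF14)
print(round(t_prime, 2), round(critical, 3))
```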


Discussion 


On the basis of the theoretical verbal 
reinforcement continuum it was pre- 
dicted that the conceptual learning of 
the Right-Wrong and Nothing-Wrong 
groups would be similar and that both 
groups would learn significantly faster 
than the Right-Nothing group. These 
predictions could not be tested against 
the “learning” of the shape concept
because evidently almost all the Ss had 
learned to respond to shape before they 
came into the experimental situation, 
and there was no further learning of the 


shape concept during the experiment. 
However, learning the color concept did 
occur during the experiment, and the 
results for color confirmed both of the 
predictions. 

On the conceptual task (Wisconsin
Card Sorting Test) it is possible to sort
for number as well as for shape or color.
Grant (4) has reported that the number
concept is more difficult to learn than
the other two. This fact may be of
significance in interpreting the lack of
difference between the Right-Wrong and
Nothing-Wrong groups. It is possible
that these groups, which were found to
be not significantly different in their
learning of shape concept, might differ
in their learning of the more difficult
number concept. Since the similarity
of the Right-Wrong and Nothing-Wrong
groups is important for the present
theoretical position, it was decided to
compare their learning of the number
concept.

Earlier in this paper it was noted that
the reinforcement continuum lacked
generality because of the limitations of
the experimental conditions under which
it was tested. Its generality is also
limited by the kind of Ss used, since
heretofore only neuropsychiatric patients
have been employed as Ss. Does the
continuum also apply to normals? In
an attempt to answer this question
student nurses were included in Exp. II.


Experiment II 
Method 


Subjects.—A total of 55 neuropsychiatric
patients of both sexes served as Ss. Fifteen Ss
were discarded for reasons given below, and the
remaining Ss were divided into two groups of
20 each. There were also 24 student nurses,
who were assigned to two groups of 12 each.

Procedure.—The procedure was similar to
that used in Exp. I, with the following exceptions.
Only the Right-Wrong and Nothing-Wrong
reinforcement combinations were used.²
The first concept was number, and Ss who did
not make 10 consecutive correct responses
within 100 trials were discarded. On this basis
15 patient Ss were eliminated, seven from the
Right-Wrong group and eight from the Nothing-Wrong
group. No nurses were discarded.

The second concept was color, and again Ss
were stopped after 100 trials if they had not yet
made 10 consecutive correct responses.

² Grant (4) has reported that the number
concept is more difficult than either color or
shape concepts. Experiment I established that
virtually all Ss respond immediately to shape.
Learning a number concept requires a shift from
this set for shape. In Exp. I the Right-Nothing
Ss were unable to shift from shape to color.
Therefore we assumed that Right-Nothing Ss
would not be able to shift from shape to number,
and this group was not included in Exp. II.


Results


TABLE 2

Number of Trials to Learn Number and
Color Concepts in Exp. II

Group                 Number (Mean)
Patients
  Right-Wrong
  Nothing-Wrong
Student Nurses
  Right-Wrong
  Nothing-Wrong

The major data are shown in Table
2. The patients’ learning will be
examined first. The Right-Wrong and
Nothing-Wrong groups learned the
number concept at almost the same
rate, the difference between means
being slight and insignificant.³ The
Nothing-Wrong group shifted to the
color concept slightly faster than the
Right-Wrong group, but again the
difference is not significant.

The nurses’ learning followed a
similar pattern. In the learning of
both the number and the color concepts
there were no significant differences
between the Right-Wrong and
Nothing-Wrong groups.

Finally, let us compare the nurses’
learning with the patients’ learning.
The difference in learning between the
nurse Right-Wrong group and patient
Right-Wrong group was not significant
for either the number or the color
concept. Similarly, the differences
between the nurse Nothing-Wrong
group and the patient Nothing-Wrong
group were not significant. However,
it should be noted that it was necessary
to discard 15 patient Ss because
they failed to learn the number concept.
This datum suggests that the
nurses do tend to learn faster than the
patients, which is not surprising in the
light of the motivation and attention
difficulties so often observed in
patients.

³ The difference between means was evaluated
by means of the Cochran-Cox test (3). The t
values for both number and color were less than
1.0; this applies to both the patients’ and the
nurses’ data. In the comparison of nurses with
patients none of the t values approached even
the .10 level of significance.


Discussion 


The results with both neuropsychiatric 
patients and student nurses clearly 
indicate that the Nothing-Wrong and 
Right-Wrong groups do not differ sig- 
nificantly in their learning new concepts. 
This finding cannot be explained by the 
commonly accepted reinforcement con- 
tinuum, which assumes that Right is as 
strong a positive reinforcer as Wrong is a 
negative reinforcer. If this continuum 
were the correct model, the Right-Wrong 
group should be superior to the Nothing- 
Wrong group. Since these two groups 
learn at similar rates, the model must be 
incorrect. 

On the other hand, one is able to pre- 
dict these results correctly by assuming 
that Right is a much weaker positive 
reinforcer (approaching Nothing) than 
Wrong is a negative reinforcer. The 
critical component is Wrong, and the 
Right component is relatively unessential. 
Since the only difference between the 
Right-Wrong and Nothing-Wrong rein- 
forcement combinations is the addition 
of Right to the Right-Wrong combination,
this latter group should not learn any
faster than the former. 

In the two experiments reported here 
an attempt was made to extrapolate from 
a generalization situation to a conceptual 
learning situation, and from data ob- 
tained solely with neuropsychiatric pa- 
tient Ss to data on normals. It was 
assumed that the theoretical continuum 
was not limited to either the generali- 
zation paradigm or to patient Ss. The 
fact that the predictions were confirmed 
in both experiments suggests that the 
assumptions were correct and leads to 
the conclusion that the stated theoretical 
verbal reinforcement continuum possesses
some generality.


SUMMARY 


A theoretical verbal reinforcement continuum 
that had been derived from generalization data 
obtained with neuropsychiatric patient Ss was 
applied to a conceptual learning situation 
(Wisconsin Card Sorting Test), with both 
patients and student nurses as Ss. The rein- 
forcement combinations were: E says Right for 
a correct response, Wrong for an incorrect 
response (Right-Wrong); Nothing for a correct 
response, Wrong for an incorrect response 
(Nothing-Wrong), and Right for a correct 
response, Nothing for an incorrect response 
(Right-Nothing). In Exp. I (patients only) Ss 
were required to learn a shape concept and then 
a color concept. There were no differences 
between groups (reinforcement combinations) 
in their learning of the shape concept, but in the
learning of the color concept the Right-Wrong
and Nothing-Wrong groups learned at similar 
rates, and both learned faster than the Right- 
Nothing group. 

In Exp. II (patients and nurses) the Ss 
learned a number concept first and then a color 
concept. Only the Right-Wrong and Nothing- 
Wrong combinations were used, and there were 
no differences in their learning of either concept. 
The results of both studies confirmed predictions 
made on the basis of the proposed verbal rein- 
forcement continuum: Nothing is a non- 
reinforcer and Right is a weaker positive rein- 
forcer (approaching Nothing) than Wrong is a 
negative reinforcer. It was concluded that this 
theoretical reinforcement continuum possessed 
some generality. 


REFERENCES

1. Buss, A. H., Wiener, M., & Buss, E.
Stimulus generalization as a function of
verbal reinforcement combinations. J.
exp. Psychol., 1954, 48, 433-436.

2. Buss, A. H., Braden, W., Orgel, A., &
Buss, E. Acquisition and extinction
with different verbal reinforcement combinations.
J. exp. Psychol., 1956, 52,
288-295.

3. Cochran, W. G., & Cox, G. M. Experimental
designs. New York: Wiley, 1950.
Pp. 92-93.

4. Grant, D. A. Perceptual versus analytical
responses to the number concept of a
Weigl-type card sorting test. J. exp.
Psychol., 1951, 41, 23-29.


(Received October 3, 1955) 





Journal of Experimental Psychology 
Vol. 52, No. 5, 1956 


ACQUISITION AND EXTINCTION WITH DIFFERENT 
VERBAL REINFORCEMENT COMBINATIONS ¹


ARNOLD H. BUSS, WILLIAM BRADEN, ARTHUR ORGEL, AND EDITH H. BUSS


Carter Memorial Hospital, Indianapolis 


A previous study (1) investigated 
the effect of verbal reinforcement 
combinations on stimulus generali- 
zation. Psychiatric patients were 
trained to make a verbal response to 
2-in. high wooden discs. There were
three groups of Ss, each having a
different reinforcement combination:
Right for a correct response, Wrong
for an incorrect response; Nothing for
a correct response, Wrong for an 
incorrect response; and Right for a 
correct response, Nothing for an 
incorrect response. After a learning 
series, stimulus generalization was 
tested. The stimuli in the test series 
were 2 in., 1 in., .5 in., and .25 in.
high. It was found that the Right- 
Wrong and Nothing-Wrong reinforce- 
ment combinations resulted in sig- 
nificantly steeper gradients of stimulus 
generalization than the Right-Nothing 
reinforcement combinations. There 
were also significant differences among 
groups in the height of the generali- 
zation gradients, the Right-Wrong 
gradient being significantly higher 
than the Nothing-Wrong gradient. 

The results raise a question about 
the reinforcing properties of Right, 
Wrong, and Nothing. Heretofore it 
has been assumed that Nothing is a 
nonreinforcer and that Right and 
Wrong have equal and opposite effects, 
as a positive reinforcer and a negative 
reinforcer, respectively. This reinforcement
continuum would be diagrammed
as follows:

Wrong          Nothing          Right
  −               0               +

¹ The writers acknowledge the helpful criticism
of Professor W. K. Estes and Dr. Harvard
Armus.


According to this continuum Right- 
Wrong offers more positive reinforce- 
ment of the correct response than 
Nothing-Wrong and more negative 
reinforcement of the incorrect re- 
sponse than Right-Nothing. Since 
the steepness of the generalization 
gradient varies directly with the 
amount of differential reinforcement, 
the Right-Wrong group should have 
the steepest generalization gradient. 
Nothing-Wrong offers more negative 
reinforcement of the incorrect re- 
sponse than Right-Nothing, but Right- 
Nothing offers more positive rein- 
forcement of the correct response. 
Therefore the Right-Nothing and 
Nothing-Wrong groups should have 
generalization gradients whose slopes 
are approximately equal. 

It was found in the previous study 
(1) that the Right-Wrong and 
Nothing-Wrong generalization gra- 
dients were similar and that both 
were significantly steeper than the 
Right-Nothing generalization gradi- 
ent. Thus the predictions that would 
be made on the basis of the commonly 
accepted verbal reinforcement con- 
tinuum were incorrect. In an at- 
tempt to explain the results of the 
previous study, it was assumed that 
Nothing is a mild reinforcer. If this 
assumption were correct, Nothing- 
Wrong would approach Right-Wrong 
as a reinforcement combination, and 
their generalization gradients would 
have similar slopes. The Right-
Nothing combination would offer less
differential reinforcement (since 
Nothing would mildly reinforce the 
incorrect response), and a flatter 
generalization gradient would result. 

The critical element in this pro- 
posed explanation is not that Nothing 
is a mild reinforcer but that Nothing 
is closer to Right than it is to Wrong 
on the verbal reinforcement con- 
tinuum. The results of the previous 
study are accounted for just as well 
by assuming that Nothing is a non- 
reinforcer and that Right is a weaker 
positive reinforcer than Wrong is a 
negative reinforcer: 


Wrong                    Nothing    Right
  −                         0         +


On the basis of this theoretical con- 
tinuum Nothing-Wrong approaches 
Right-Wrong, and Right-Nothing offers 
less differential reinforcement than 
either of the other combinations. 
Thus assuming that Nothing is a non- 
reinforcer and Wrong is a stronger 
negative reinforcer than Right is a 
positive reinforcer generates the same 
predictions as assuming that Nothing 
is a mild reinforcer. 

However, assuming that Nothing 
is a mild reinforcer would seem to 
conflict with known facts concerning 
extinction. In the typical extinction 
procedure every response is followed 
by nothing, which usually results in 
the diminution of previously acquired 
response tendencies. Such decrement 
in response tendencies would ostensi- 
bly not occur if Nothing were a mild 
reinforcer. Therefore we have ac- 
cepted the alternate formulation, i.e., 
Nothing is a nonreinforcer, and Right 
is a weaker positive reinforcer than 
Wrong is a negative reinforcer. Since 
these assumptions have been made 
post hoc, they must be checked out
against new data. Accordingly, the
present paper presents three experiments
on verbal reinforcement combinations.

In the previous study (1) at the end 
of the acquisition series the Ss were 
told that E would no longer say any- 
thing. This instruction served to 
prevent extinction of the response 
tendencies just learned, despite the 
fact that the E no longer reinforced 
any responses. In the first of the 
present experiments this instruction 
was omitted, and the response tend- 
encies learned in acquisition were 
allowed to extinguish. 


EXPERIMENT I


Method 


Subjects.—The Ss were 60 patients in a
neuropsychiatric hospital who were not receiving
shock treatment, not disoriented, not diagnosed
as having organic brain damage, and not mentally
defective. The Ss were of both sexes, and they
were randomly assigned to three groups of 20
each.

Material.—The stimuli were wooden blocks 
of various shapes, colors, areas, and heights. 
There were three shapes, five colors, three areas, 
and four heights. The heights were 2, 1, .5, and
.25 in. Every stimulus was different from every
other stimulus with respect to shape, color,
height, or area. The various shapes, colors, and
areas were divided equally among the four
heights.

These stimuli differed from those used in the
previous study (1). In the present study there
were circular, square, and triangular shaped
blocks, whereas in the previous study only
circular blocks were used. The variable of
shape was added in order to make learning more
difficult, which in turn permitted the length of
the acquisition series to be increased. During
the previous study it was found necessary to
discard those Ss who perseverated on one response
and therefore did not receive both aspects
of the reinforcement combination, e.g., if all of
S’s responses in the eight-trial acquisition series
were incorrect, he would receive only the Wrong
component of the Right-Wrong combination.
Extending the learning series would diminish the
probability of S’s perseverating on one response
throughout the entire series, and in fact it was
not necessary to discard any of the Ss from the
present experiment. Furthermore, an extended
acquisition series would permit a more detailed
study of the learning process. Therefore the
acquisition series was extended from 8 to 15
trials.

Procedure.—The E read the following instructions:
“I am going to show you a series of
wooden blocks, one at a time. Some of the
blocks are Vec (V-E-C), and some are not Vec.
You don’t know what a Vec is now, but it will
become clear to you as we go. To begin with
you may have to guess.” Each stimulus was
presented singly for about 5 sec. After each 
response E gave the appropriate verbal rein- 
forcement, and then the next stimulus was 
presented. The extinction series followed im- 
mediately after the acquisition series, and Ss 
were not told that there were two series. 

Design.—Series 1 (acquisition) consisted of 
15 stimuli, all of which were 2-in. high blocks. 
All Vec responses were correct, and all not Vec
responses were incorrect. The verbal reinforcement
combinations given by E were:
Group I—Right for a correct response, Wrong
for an incorrect response; Group II—Nothing
for a correct response, Wrong for an incorrect
response; Group III—Right for a correct response,
Nothing for an incorrect response.

Series 2 (extinction-generalization) consisted 
of 40 stimuli: 10 each of 2-, 1-, .5-, and .25-in. 
high blocks. The 2-in. stimuli used in extinction
were different from the 2-in. stimuli used in
acquisition. The order of presentation was the
same for all Ss; it was a random order with the
limitation that no more than two successive
stimuli were of the same height. The E said
nothing after every response. 
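
A presentation order of this kind can be generated in several ways; the paper does not say how its order was produced. The sketch below uses repeated shuffling until the constraint is met, which is only one possible method, and the function names and the seed parameter are hypothetical.

```python
import random


def longest_run(seq):
    """Length of the longest run of identical adjacent items."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def constrained_order(heights=(2.0, 1.0, 0.5, 0.25), per_height=10,
                      max_run=2, seed=None):
    """Shuffle until no height occurs more than `max_run` times in succession."""
    rng = random.Random(seed)
    order = [h for h in heights for _ in range(per_height)]
    while True:
        rng.shuffle(order)
        if longest_run(order) <= max_run:
            return order


order = constrained_order(seed=1)
print(order)
```

Because the same order was used for all Ss, a single list generated once would serve for the whole extinction series.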


Results 


The measure of response strength 
used is the frequency of the Vec 
responses. Since the Vec response
frequency is inversely proportional to 
the not Vec response frequency, the 
latter measure is superfluous. The 
acquisition and extinction data are 
presented in Fig. 1. During acqui- 
sition only 2-in. stimuli were pre- 
sented; the 15 trials were divided into 
blocks of five each in order to examine 
the learning curves. During the first 
block of five trials all three groups 
responded at just above chance (50%). 
The Right-Nothing acquisition curve 
levels off after the second block of 
trials, while the curves of the Right-Wrong
and Nothing-Wrong groups
show continued acceleration. All
three groups manifested learning (in
terms of positive slopes of their
curves), but the slope of the Right-Nothing
curve seems to be flatter
than the other two curves.

Fig. 1. Acquisition and extinction generalization
in Exp. I. Left panel: acquisition (2-in. stimuli only),
by blocks of five trials. Right panel: extinction,
by height of stimuli in inches.

The significance of these trends was 
tested by an analysis of variance for 
repeated measurements on the same 
Ss (2), which is presented in Table 1. 
This analysis indicates that (a) the 
groups do not differ significantly in 
their over-all Vec response frequency;
(b) there are significant intertrial 
differences, i.e., among the three 
blocks of five trials each; and (c) the 
interaction between groups and trials 
is significant, i.e., the slopes of the 
acquisition curves are significantly 
different. 
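
The partition of degrees of freedom in an analysis of this kind follows directly from the design (three groups of 20 Ss, each S measured on every block). The sketch below is illustrative and the function name is hypothetical. In the usual treatment of such a design the groups effect is tested against the between-Ss term, and the repeated-measures effect and its interaction with groups are tested against the pooled Ss-by-levels term.

```python
def mixed_design_df(n_groups, n_per_group, n_levels):
    """Degrees of freedom for groups (between Ss) by repeated levels (within Ss)."""
    n_subjects = n_groups * n_per_group
    df = {
        "between groups": n_groups - 1,
        "between Ss in same group": n_groups * (n_per_group - 1),
        "between levels": n_levels - 1,
        "groups x levels": (n_groups - 1) * (n_levels - 1),
        "pooled Ss x levels": n_groups * (n_per_group - 1) * (n_levels - 1),
    }
    df["total"] = n_subjects * n_levels - 1
    # The five component terms must account for every degree of freedom.
    assert sum(v for k, v in df.items() if k != "total") == df["total"]
    return df


print(mixed_design_df(3, 20, 3))  # acquisition: three blocks of five trials
print(mixed_design_df(3, 20, 4))  # extinction: four stimulus heights
```

The first call gives 2, 57, 2, 4, and 114 df with a total of 179; the second gives 2, 57, 3, 6, and 171 df with a total of 239, the partition that applies when the four stimulus heights replace the three blocks of trials.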


TABLE 1

Analysis of Variance of Vec Responses in
the Acquisition Series of Exp. I

Source                           df      MS       F

Between groups                    2     6.95     2.31
Between Ss in same group         57     2.97
Between Blocks of trials          2    27.70    41.33*
Groups × Blocks of trials         4     2.85     4.25*
Pooled Ss × Blocks of trials    114      .67
Total                           179

* P < .01.







In the extinction-generalization
series there were 40 stimuli, 10 of each 
height. The right-hand side of Fig. 
1 shows the generalization gradients 
in the extinction series. The gradi- 
ents of the Right-Wrong and Nothing- 
Wrong groups are clearly higher than 
the gradient of the Right-Nothing 
group, but the slopes appear similar. 
The significance of these trends was 
tested by an analysis of variance, 
which is presented in Table 2. 

This analysis indicates that (a) the 
groups differ significantly in their 
over-all Vec response frequency, (b)
there are significant differences in 
response frequencies to the various 
stimuli on the height continuum, and 
(c) the slopes of the generalization 
gradients are not significantly different. 

Each point on the generalization 
gradients during extinction represents 
the percentage of Vec responses made 
during 10 trials. It may be expected 
that during extinction there would be 
changes in Vec response strength. In 
order to discover any changes, the 10 
trials with each of the four stimuli 
were divided into two blocks of five 
each. These data are presented in 
Table 3. The changes from the first 
to the second block of trials were all 
small, and statistically nonsignificant 
(t test for correlated means), indi- 
cating that extinction was very slow. 
Also the generalization gradients


TABLE 2

Analysis of Variance of Vec Response
Frequencies in the Extinction
Series of Exp. I

Source                          df     MS       F

Between groups                   2   114.50    4.09*
Between Ss in same group        57    28.01
Between Stimuli                  3    15.33    8.51**
Groups × Stimuli                 6     1.25     .69
Pooled Ss × Stimuli            171     1.80
Total                          239

*P < .05.
**P < .01.




TABLE 3

Percentage of Vec Responses in the
Extinction Series of Exp. I

Height of Stimuli | Right-Wrong Trials | Nothing-Wrong Trials | Right-Nothing Trials

tended to flatten slightly from the 
first to the second block of trials: 
Vec response frequency tended to 
drop for 2-, 1-, and .5-in. high stimuli
and increase for .25-in. high stimuli. 


Discussion 


We have assumed that on a continuum 
of verbal reinforcement Right is closer 
to Nothing than is Wrong, i.e., Right is a
considerably weaker positive reinforcer 
than Wrong is a negative reinforcer. 
What follows from this assumption? 
First, to the extent that Right approaches 
Nothing as a reinforcer, the Right-Wrong 
and Nothing-Wrong combinations ap- 
proach identity. Thus, these two groups 
should have similar acquisition curves 
and similar generalization gradients in 
the extinction series. The data in Fig. 
1 conform to this expectation. Second, 
if Right is only a weak reinforcer, the 
Right-Nothing combination should result 
in slower learning, and the tendency to 
respond Vec should be weaker than in 
the other two combinations. These 
expectations were also borne out, as 
shown in Fig. 1. 

In contrast to the previous study (1),
in the present study there were no differences
among the slopes of the generalization
gradients, and these gradients
were flatter than those of the previous
study. These discrepancies in results
probably stem from the different conditions
present in Series 2. In the
previous study Series 2 was a test of
generalization, and instructions were
given to prevent extinction. In the
present experiment Series 2 was an







extinction series in which generalization 
was tested. The extinction would be 
expected to flatten generalization gradi- 
ents (Table 3 shows that such flattening 
occurred), and such leveling of the 
gradients might obliterate any differences 
among the slopes. 

In the previous study the Right-Wrong 
generalization gradient was significantly 
higher than the Nothing-Wrong gradient, 
but this difference did not appear in the 
present study. We could find no obvious 
explanation for this discrepancy. It 
should be noted, however, that in the 
previous study the acquisition series was 
considerably shorter, and a number of 
Ss had to be discarded because they made 
all Vec or all not Vec responses. The
shorter acquisition series, the discarding 
of Ss, or possibly the presence of extinc- 
tion in Series 2 of the present study might 
account for the difference in results. 
However, this is merely speculative, and 
the basic issue is whether these data are 
stable enough for us to theorize about 
the nature of the verbal reinforcement 
continuum. The remaining two experiments
will provide data concerning the
consistency of the results already 
obtained. 

In Exp. I the Right-Wrong and 
Nothing-Wrong acquisition curves had 
significantly steeper slopes and attained 
a considerably greater height than the 
Right-Nothing acquisition curve (see 
Fig. 1). While the acquisition series of 
Exp. I was almost double the length of 
the acquisition series of the previous 
study (1), it is possible that the number 
of trials was still too small. Perhaps 
with additional trials the Right-Nothing 
group would attain the same level of 
learning as did the other two groups. 
In Exp. II the acquisition series was 
extended to 60 trials in order to discover 
whether the differences in learning found 
in Exp. I would hold up in a longer 
learning series. 


Experiment II 
Method 


Subjects.—The Ss were 30 neuropsychiatric
patients who were selected in the same manner 




as in Exp. I. They were assigned randomly 
to two groups of 15 each. 

Procedure.—The entire procedure, including 
the experimental design, was identical to that 
used in Exp. I, with the following exceptions. 
The acquisition series consisted of 60 stimuli, 
and there was no extinction series. Only the 
Right-Nothing and Nothing-Wrong reinforcement
combinations were used. For the present
purpose it was necessary to compare the Right- 
Nothing group with only one of the other two 
groups, since the latter were so similar in Exp. I. 
What is unusual about the similarity of the 
Right-Wrong and Nothing-Wrong groups in Exp. 
I is the performance of the Nothing-Wrong 
group. On the basis of the commonly accepted 
reinforcement continuum Nothing-Wrong should 
result in slower learning than Right-Wrong. The
Nothing-Wrong group performed better than 
was expected, learning at approximately the 
same rate as the Right-Wrong group and sig- 
nificantly faster than the Right-Nothing group. 
In order to determine whether the latter differ- 
ence would hold up in a new study, only the 
Right-Nothing and Nothing-Wrong combinations 
were used. 


Results 


The 60-trial acquisition series was 
divided into blocks of 10 trials. The 
Vec response frequencies of each group 
for the six blocks of trials are plotted 
in Fig. 2. The Nothing-Wrong curve 
rises sharply during the first 20 trials, 
but thereafter its slope is more gentle. 
The Right-Nothing curve manifests 
no rapid initial acceleration; it rises 


[Figure: ordinate, percent of Vec responses; abscissa, blocks of ten trials.]

Fig. 2. Acquisition in Exp. II.







TABLE 4

Analysis of Variance of Response
Frequencies in the Acquisition
Series of Exp. II

Source                          df     MS       F

Between groups                   1   186.05    6.35*
Between Ss in same group        28    29.29
Between Blocks of trials         5    23.63   12.18**
Groups × Blocks of trials        5    25.15   12.96**
Pooled Ss × Blocks of trials   140     1.94
Total                          179

*P < .05.
**P < .01.


slowly during the first 40 trials and 
then drops off slightly. The two 
curves differ considerably in over-all 
height, and the learning of the 
Nothing-Wrong group seems to be 
superior throughout the entire ac- 
quisition series. 

The significance of these trends 
was tested by an analysis of vari- 
ance, which is presented in Table 4. 
This analysis indicates that (a) there 
is a significant difference between 
groups in the number of Vec responses, 
(b) the frequency of Vec responses
increases significantly among the 
blocks of trials, and (c) the slopes of 
the curves are significantly different. 


Discussion 


This experiment was designed to 
compare the learning of the Right- 
Nothing and Nothing-Wrong groups in 
a considerably enlarged acquisition 
series. The increased length of the 
acquisition series evidently had little 
effect on the learning of the two groups. 
The Nothing-Wrong group learned al- 
most as rapidly as its counterpart in 
Exp. I and significantly faster than the 
Right-Nothing group. The  Right- 
Nothing group learned slightly slower 
than its counterpart in Exp. I and 
seemed to derive little benefit from the 
additional acquisition trials. On the 
basis of the first two experiments it is 
concluded that the learning of the Right-
Nothing group is consistently inferior to
that of the Nothing-Wrong group. 




The reformulation of the theoretical 
verbal reinforcement continuum led to 
two predictions: (a) the Right-Wrong 
and Nothing-Wrong groups would learn 
faster than the Right-Nothing group, and 
(b) the Right-Wrong and Nothing-Wrong
would manifest similar learning. The 
results of the first two experiments by 
and large confirm the first prediction. 
However, the similarity of the Right- 
Wrong and Nothing-Wrong groups is not 
so well established. In the previous 
study (1) the generalization gradients of 
these two groups had similar slopes but 
different heights, whereas in Exp. I of 
the present paper there was no difference 
in either slope or height. Since the issue 
is one of reliability of data, it was de- 
cided to replicate the learning of Exp. I. 

The extinction series of Exp. I con- 
sisted of 40 stimuli, 10 each of four 
different heights. For stimuli of a given 
height there were 10 extinction trials. 
With only 10 extinction trials it is 
possible that real differences between 
groups might be obscured. Therefore 
it was decided to use only 2-in. stimuli, 
and the extinction series was extended 
to 30 trials. 


Experiment III 
Method 


Subjects.—The Ss were 40 neuropsychiatric
patients who were assigned randomly to two 
groups of 20 each. 

Procedure.—The entire procedure, including 
the experimental design, was identical to that 
used in Exp. I, with the following exceptions. 
There were only two groups, Right-Wrong and 
Nothing-Wrong. The acquisition series con- 
sisted of 15 2-in. high stimuli, and there were 30 
2-in. high stimuli in the extinction series. The 
extinction stimuli were different from the
acquisition stimuli.


Results 


Both the acquisition and extinction 
series were divided into blocks of five 
trials, and the Vec response frequencies
were plotted. The curves
are shown in Fig. 3. In acquisition 
both groups started responding at 
just above chance (50%). The slopes 







[Figure: left panel, Acquisition; right panel, Extinction; abscissa, blocks of five trials.]

Fig. 3. Acquisition and extinction in Exp. III.

of the curves are similar, and at the 
end of the acquisition series the 
Nothing-Wrong group was responding 
at only a slightly higher level than the 
Right-Wrong group. An analysis of 
variance, presented in Table 5, reveals 
that the Vec response frequency of
both groups increased significantly
among the three blocks of acquisition
trials. However, there was no significant
difference between groups in
either their over-all Vec response
frequency or the slope of their learning 
curves. 

The Right-Wrong and the Nothing- 
Wrong acquisition curves bear the 
same relationship to each other in
this experiment as they did in Exp. I 
(see Fig. 1). In both experiments the 
two groups start at about the same 
level, and at the end of acquisition the 
Nothing-Wrong curve is slightly 


TABLE 5

Analysis of Variance of Acquisition
Data in Exp. III

Source                          MS       F

Between groups
Between Ss in same group        2.12
Between Blocks of trials       12.93   17.47***
Groups × Blocks of trials        .25     .34
Pooled Ss × Blocks of trials     .74
Total

***P < .001.


higher than the Right-Wrong curve. 
In both experiments there is no 
significant difference in the over-all 
height or the slope of the learning 
curves of the two groups. 

At the start of the extinction series 
both groups were responding at a 
higher level than at the end of 
acquisition. Extinction was slow for 
both groups, and a fairly high level of 
responding was maintained through- 
out the 30 extinction trials, although 
the slope of the Right-Wrong curve 
appears to be slightly steeper than the 
slope of the Nothing-Wrong curve. 
An analysis of variance, presented in 
Table 6, reveals that the slopes of the 
two curves are significantly different. 
However, the differences between 
groups in over-all Vec response tend- 
ency and the difference among trials 
were not significant. 

The steeper slope of the Right- 
Wrong extinction curve is not sur- 
prising. In the acquisition series of 
the Right-Wrong group a Vec response was
followed by Right, whereas in extinc- 
tion a Vec response was followed by 
Nothing. The change from Right in 
acquisition to Nothing in extinction 
should diminish the tendency to 
respond Vec. On the other hand in 
both the acquisition and extinction 
series of the Nothing-Wrong group 
the Vec response was followed by 
Nothing. This similarity of condi- 
tions in acquisition and extinction 


TABLE 6

Analysis of Variance of Extinction
Data in Exp. III

Source

Between groups
Between Ss in same group
Between Blocks of trials
Groups × Blocks of trials
Pooled Ss × Blocks of trials
Total

*P < .05.







would be expected to retard extinction 
of the Vec response and make the 
Nothing-Wrong extinction curve flat- 
ter than the Right-Wrong curve. 


Discussion 


The three experiments yielded three 
major findings that must be accounted 
for by a hypothetical reinforcement 
continuum: (a) the Right-Wrong and 
Nothing-Wrong combinations have simi- 
lar effects; (b) the Right-Nothing com-
bination results in slower learning than 
the other two combinations; and (c) 
learning acquired with the Right-Wrong 
combination extinguishes faster than 
learning with Nothing-Wrong. 

Heretofore it has been assumed that
Right and Wrong have equal and opposite
effects, with Nothing being a nonreinforcer.
This commonly accepted reinforcement
continuum would generate
the prediction that the Right-Nothing
and Nothing-Wrong combinations would
have similar effects and that both combinations
would result in slower learning
than Right-Wrong. This prediction is
clearly at variance with the findings of
the three experiments, and it seems that
the commonly accepted reinforcement
continuum cannot account for these data.

The reinforcement continuum that is
proposed was suggested by the results
of an earlier study (1). It is assumed
that Nothing is a nonreinforcer, Right is
a weak positive reinforcer, and Wrong is
a strong negative reinforcer. Thus the
critical component of the Right-Wrong
and Nothing-Wrong combinations is
Wrong, and these two combinations
should have similar effects. If Right is
a weak reinforcer, the Right-Nothing
combination should result in slower
learning than the other two. Thus the
proposed theoretical verbal reinforcement
continuum accounts for the first
two findings.

The third finding was the significantly
slower extinction of the Nothing-Wrong
group as compared to the Right-Wrong
group. The lack of extinction in the
Nothing-Wrong group is accounted for




in terms of the similarity of the con- 
ditions of acquisition and extinction. 

The discrepancy in findings between 
Exp. I and an earlier study (1) led to a 
question regarding the consistency of 
results. The second and third experi- 
ments reported here bear on this issue. 
The finding in Exp. I that Nothing- 
Wrong results in faster learning than 
Right-Nothing was duplicated in Exp. 
II. The finding in Exp. I that Nothing-
Wrong and Right-Wrong combinations 
yield similar learning curves was con- 
firmed in Exp. III. Thus, while the 
findings of a previous study were not 
duplicated, the consistencies in the 
present three experiments lead to the 
conclusion that the findings are reliable 
enough to be used in support of the stated 
theoretical position. 


SUMMARY 


Three experiments were designed to evaluate
a theoretical verbal reinforcement continuum
which assumes that Wrong is a stronger negative
reinforcer than Right is a positive reinforcer.
Psychiatric patients learned to make a designated
response to a series of wooden blocks
presented singly. After each response E presented
one of three possible combinations of
verbal reinforcement: (a) Right for a correct
response, Wrong for an incorrect response (Right-Wrong);
(b) Nothing for a correct response,
Wrong for an incorrect response (Nothing-Wrong);
(c) Right for a correct response, Nothing
for an incorrect response (Right-Nothing).

The following facts emerged from the three
experiments: (a) Right-Nothing leads to slower
learning than do the other two combinations;
and (b) Right-Wrong and Nothing-Wrong have
similar acquisition curves and both extinguish
slowly, but Right-Wrong leads to faster extinction
than Nothing-Wrong. These findings were
interpreted as being consistent with the proposed
theoretical verbal reinforcement continuum.


REFERENCES 


1. Buss, A. H., Weiner, M., & Buss, E.
Stimulus generalization as a function of
verbal reinforcement combinations. J.
exp. Psychol., 1954, 48, 433-44

2. Edwards, A. L. Experimental design in
psychological research. New York:
Rinehart, 1950.

(Received September 26, 1955) 





Journal of Experimental Psychology
Vol. 52, No. 5, 1956


THE RELATION OF ANXIETY (DRIVE) LEVEL TO
PERFORMANCE IN COMPETITIONAL AND NONCOMPETITIONAL
PAIRED-ASSOCIATES LEARNING¹


K. W. SPENCE, I. E. FARBER, AND H. H. McFANN²


State University of Iowa


Conditioning studies involving some 
form of noxious stimulation have 
revealed that level of performance is 
a function of the intensity of the 
unconditioned stimulus (13, 17). 
One interpretation that has been 
given of this finding is that the more 
noxious the stimulus the higher is the 
level of the emotional response (state 
of emotionality) of S (21, 22). Level
of emotionality, in turn, is one of the 
factors assumed to determine the 
total effective drive level of the 
organism. This concept of drive 
level or D is one of the important 
intervening variables determining re- 
sponse strength in S-R theory. 

Another line of evidence indicating 
that noxious stimulation and its after- 
effects determine level of response are 
the studies (3, 15, 16) which have 
shown that the level of consummatory 
response (eating, drinking) is signifi- 
cantly increased for a period of time 
if Ss are shocked just prior to being 
placed in the food or water situation. 
These investigators have interpreted 
their findings as reflecting the per- 
severation of the emotional state 
produced by the preceding shocks, 
which is assumed to increase response 
strength through increase in level of 


D.


¹ This study was carried out as part of a
project conducted under contract N9 onr-93802, 
Project 154-107 between the State University 
of Iowa and the Office of Naval Research. A 
portion of the data in the first experiment was 
collected by Rhoda Ketchel. 

² Now at Human Research Unit No. 3,
OCAFF Fort Benning, Georgia. 


Similar motivational properties 
have been demonstrated in the case 
of non-noxious stimuli which, in the 
previous history of S, have been 
associated with a noxious stimulus. 
Mowrer (12) and Miller (10) have 
assumed that such prior training 
establishes a conditioned emotional 
(fear) response to the previously 
neutral stimulus. Studies such as 
those of Amsel (1), Kalish (8), and 
Brown, Kalish, and Farber (4) have 
demonstrated that the presence of 
these conditioned fear arousing stimuli 
can intensify coincident stimulus- 
response tendencies. 

Accepting the notion that the 
degree of emotionality of S, produced 
either by unconditioned or condi- 
tioned stimuli, affects level of response,
and interpreting this effect
within the framework of our theo- 
retical system as reflecting level of D, 
a series of experiments was initiated 
a number of years ago which at- 
tempted to manipulate degree of 
emotionality in a quite different 
manner (20, 21, 22, 23). In the first 
of these studies (23) a test was 
developed which was aimed at dif- 
ferentiating Ss in terms of the degree 
to which they admitted having overt 
symptoms of emotionality. The test 
was in the form of a personality 
inventory, the items of which were 
judged by clinical psychologists to 
differentiate persons in terms of their 
emotional responsiveness. Unfortu- 
nately, the scale was called a test of 
“manifest anxiety,” which has led to 









all manner of investigations designed 
to ascertain whether it is a valid test 
of real anxiety.³ We shall continue
to refer to it as an anxiety scale (A 
scale) but with no assumption other 
than that it differentiates degrees of 
emotional responsiveness and level of 
D.

Turning now to the role of drive in 
learning situations, the effect of vari- 
ations in the level of D will, according 
to the theory, depend upon the nature 
of the learning task. As has been
pointed out on a number of occasions 
(18, 19), the implications of a theory 
are a joint function of the laws or 
hypothetical relations postulated in 
the theory and what are called the 
initial or boundary conditions of the 
behavior situation. In simple clas- 
sical conditioning, in which there is 
but a single response tendency, an 
increase in the strength of D results 
in a higher level of E, and hence
implies a stronger response (R = f(E)
= f(H × D)). In more complex
learning situations involving a hier- 
archy of competing responses, how- 
ever, the effect of drive level variation 
will depend upon this initial response 


³ One sort of criticism of our experiments has
revealed a serious misunderstanding of their 
purpose and underlying logic. It is that since 
there is not independent evidence that the test 
really measures emotionality, and there is 
evidence that the test scores correlate with other 
personality indices, it cannot legitimately be 
assumed that differences on the test reflect 
differences in drive level (D). To repeat again 
the reasoning of these experiments, the hypothesis 
is set up that the test scores reflect differences in 
emotionality and hence differences in D. This
hypothesis is then tested by deriving, with the 
aid of other parts of the theory of learning, 
implications concerning differences to be ex- 
pected in conditioning and various other types 
of learning situations. Confirmation of these 
deductions lends support to the theory, including 
the hypothesis about the relation of the anxiety 
scale scores to D. Obviously they don’t prove 
the theory, just as any theory is never proved in 
science. 




hierarchy and the relative position in 
it of the response that is to be learned. 
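
The dependence of the drive effect on the response hierarchy can be made concrete with a small numerical sketch. It assumes only that excitatory potential is the product of habit strength (H) and drive (D), as stated in the text; the habit and drive values are hypothetical, chosen for illustration, and the function names are not from the original paper.

```python
def excitatory_potential(habit, drive):
    """E = H x D: excitatory potential as the product of habit strength and drive."""
    return habit * drive


def margin_for_correct(h_correct, h_incorrect, drive):
    """Difference in E between the correct response and a competing incorrect one."""
    return (excitatory_potential(h_correct, drive)
            - excitatory_potential(h_incorrect, drive))


LOW_DRIVE, HIGH_DRIVE = 1.0, 2.0  # hypothetical drive levels

# Single response tendency (simple classical conditioning):
# a higher D yields a higher E for the same habit strength.
e_low = excitatory_potential(0.5, LOW_DRIVE)
e_high = excitatory_potential(0.5, HIGH_DRIVE)

# Competing hierarchy in which the incorrect response is initially stronger:
# raising D enlarges the disadvantage of the correct response.
margin_low = margin_for_correct(0.3, 0.6, LOW_DRIVE)
margin_high = margin_for_correct(0.3, 0.6, HIGH_DRIVE)

print("single response:", e_low, e_high)
print("margin for correct response:", margin_low, margin_high)
```

Because D multiplies every habit in the hierarchy, it widens whatever difference already exists, which is why the same increase in drive helps in the single-response case and hurts when an incorrect tendency is dominant.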

In general, the greater the number 
and strength of the competing, incor- 
rect responses relative to the correct 
response, the more detrimental should 
a high drive be to performance 
level, at least in the early stages of 
learning. Making use of the known 
fact that anticipatory and persevera- 
tive tendencies in serial learning 
produce strong competing response 
tendencies, a test of this implication 
has been made in three studies, one 
involving a verbal maze (24), one a 
stylus maze (6) and one rote serial 
learning (11). All three experiments 
provided evidence supporting the 
implication that the high-anxious Ss 
would be inferior to Ss scoring at the 
low end of the scale. 

In these serial learning experiments, 
however, one has little or no knowl- 
edge of the relative strength of the 
correct and incorrect S-R tendencies. 
On the assumption that the incorrect, 
competing responses are based on 
theoretical remote associations or 
generalized response tendencies, it is 
possible, as Montague (11) did, to 
vary the similarity of the nonsense 
syllables employed, and thus to ma- 
nipulate, theoretically, the strength 
and number of competing S-R ten- 
dencies. We were interested, however, 
in designing a learning situation in 
which it would be possible to manipu- 
late in some better known manner the 
strengths of both the correct and the 
competing, incorrect S-R tendencies. 
Minimization of the latter would 
provide a situation in which Ss with 
high drive level would be expected to 
perform better than those with a low 
drive, whereas if we maximized the 
relative strengths and number of 
competing, incorrect S-R tendencies, 
the opposite result should obtain. 
The present study describes two such 







learning situations and presents the 
findings of a separate experiment with 
each. 


Theoretical Analysis of Paired-
Associates Learning


The situation selected for the experi- 
ments was paired-associates learning. 
In this type of learning situation, S is
required to learn to respond to a stimulus 
word or nonsense syllable by anticipating 
a paired response syllable or word. By 
using different orders of presentation of 
the paired words the development of 
remote associations, so prominent in
serial learning, is minimized. 

Paired-associates learning may be 
conceived as consisting of a set or series 
of more or less isolated S-R associations 
or habit tendencies (S1 → Ra, S2 → Rb,
S3 → Rc, etc.) that become established
as a consequence of the training pro- 
cedure. Theoretically, if these stimulus- 
response items were entirely isolated 
from one another so that the only ex- 
istent associative tendencies were be- 
tween each stimulus word and its own 
paired response word, then Ss with
relatively high drive would be expected 
to perform at a higher level in learning 
such a series than Ss with a lower drive 
strength. Essentially, the situation is 
similar to that of classical conditioning, 
except that instead of one S-R tendency 
being conditioned, a number of different 
S-R tendencies are being established 
simultaneously. While it may not be 
possible to obtain complete isolation 
among the S-R items, it is known how, 
on the basis of existing experimental 
knowledge, to approach this limiting 
condition with its minimal competition 
among S-Rs. Similarly, it is known how 
to vary the conditions so as to increase 
the amount of competition among them.

One of the most important factors 
determining the degree of isolation of the 
paired S-Rs is that of generalization, 
which, in turn, is a function of the degree 
of synonymity and/or formal similarity 
among the stimulus and response words. 
If this factor is minimal, there will be 
little or no generalized tendency for S1




to elicit responses other than Ra, S2 to
elicit responses other than Rb, etc.

A second factor that enters into such 
paired-associates learning is the strength 
of the associative connection between 
any stimulus word and any response 
word. As the result of past experience, 
words tend to become associated with 
other words to varying degrees, and for 
each word the hierarchy of associative 
strengths tends to be similar for indi- 
viduals in the same culture. Such 
differences in the strength of associative 
connections between words in a language 
are exemplified by the word association 
data of Kent and Rosanoff (9). 

It is readily apparent that one may 
also take advantage of this factor to 
control not only the extent to which each 
stimulus word will tend to elicit its own 
paired word but also will tend to elicit 
response words other than the one with 
which it is paired. Thus, we could pair 
each stimulus word with a response word 
with which it tends, as the result of past 
verbal experiences, to be highly associated
and, at the same time, make sure
that the associative connections between
each stimulus word and each of the non- 
paired response words are low or non- 
existent. Such a condition would ob- 
viously help further to minimize the 
likelihood of competing response ten- 
dencies of any appreciable strength for 
each stimulus-response pair. A list of 
paired associates in which the paired 
words have high initial associative con- 
nections and in which the degree of 
synonymity of the stimulus and response 
words is minimal would thus provide a 
learning situation in which high-drive 
(high-anxious) Ss should perform at a 
higher level than low-drive (low-anxious) 
Ss. 

Contrariwise, we may construct a 
paired-associates list with a high amount 
of competition in which the opposite 
finding should occur; that is, the high- 
anxious Ss should perform more poorly 
than the low-anxious. There are a 
number of different ways in which such 
competition may be introduced, one of 
which will be described here. 

Beginning with four stimulus-response 







pairs having high associative connec- 
tions, the remaining eight pairs are 
formed as follows. For each of the four 
original stimulus words two synonymous 
stimulus words are selected and paired 
with response words with which they 
have little or no associative strength. 
Thus, for each triad of synonymous 
stimulus words, two are paired with 
response words with which they are 
weakly associated, if at all, and one is 
paired with a highly associated word, as 
follows: 


S1  ----> Ra (strong)
S1′ ----> Rb (weak)
S1″ ----> Rc (weak)


The stimulus words S1′ and S1″, being
highly synonymous with S1, also have a
high initial associative connection with
Ra. As a result, the learning of the
pairs involving these stimuli, i.e.,
S1′ → Rb and S1″ → Rc, would involve a
strong competing response tendency, 
one, in fact, that is stronger than that 
to its paired response. In the case of 
these paired words, then, we would 
expect the anxious Ss to be poorer than 
the nonanxious. 

The implications of the theory with 
respect to the relative performance of 
high and low drive Ss on the four 
stimulus-response pairs of the list that 
have strong original connections (e.g., 
S1 → Ra) are more involved. At the
very beginning of learning the perform- 
ance of the high-drive Ss should be 
superior to that of low-drive Ss, just as 
in the case of the first, noncompetitional 
list. If properly chosen, these stimulus 
words should have little if any initial 
associative tendencies to Rb or Rc.
However, once Ss begin to learn the
other pairs (e.g., S1′ → Rb, S1″ → Rc),
there should develop a generalized habit
for S1 to evoke Rb and Rc (principle of
generalization of associative or habit
strength). Since the excitatory potential
(E) from S1 to these responses (Rb and
Rc) would reach super-threshold values
sooner for the high-drive group than for




the low-drive group we should expect 
these responses to intrude or block the 
correct response (Ra) earlier (and more
frequently) in the case of the high-drive 
group. Thus, we would be led to pre- 
dict that the initial superiority of the 
high-drive group on the strongly asso- 
ciated pairs should tend to disappear 
during training. Evidence with respect 
to these theoretical expectations was 
sought in the following experiments. 
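
The expectation that generalized competing responses become supra-threshold sooner under high drive can be illustrated with a brief sketch. Everything numerical here is an assumption made for illustration: the negatively accelerated growth of generalized habit strength over trials, the growth rate, the threshold, and the two drive levels are hypothetical and are not taken from the paper. Only the multiplicative relation between habit and drive comes from the text.

```python
def first_suprathreshold_trial(drive, threshold, growth_rate, max_trials=100):
    """Return the first trial on which E = H x D exceeds the threshold.

    Generalized habit strength H is assumed to grow toward 1.0 by a constant
    fraction of the remaining distance on each trial (a hypothetical growth rule).
    Returns None if the threshold is not exceeded within max_trials.
    """
    habit = 0.0
    for trial in range(1, max_trials + 1):
        habit += growth_rate * (1.0 - habit)  # negatively accelerated growth
        if habit * drive > threshold:         # excitatory potential vs. threshold
            return trial
    return None


THRESHOLD = 0.5      # hypothetical response threshold
GROWTH_RATE = 0.1    # hypothetical growth of generalized habit per trial

low_drive_trial = first_suprathreshold_trial(1.0, THRESHOLD, GROWTH_RATE)
high_drive_trial = first_suprathreshold_trial(2.0, THRESHOLD, GROWTH_RATE)

print("low drive :", low_drive_trial)
print("high drive:", high_drive_trial)
```

Under these assumed values the competing response crosses the threshold earlier for the high-drive case, which is the condition under which intrusions or blocking of the correct response would begin sooner.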


Experiment I


Since all of our previous experi- 
mental studies with verbal learning 
had involved comparison of high- 
and low-anxious Ss, in situations in 
which there were strong competing 
responses, we were interested, first, 
in testing the prediction that a non- 
competitive verbal learning situation 
would reveal a superior performance 
on the part of high-anxious Ss, as has 
been found in the case of simple 
classical conditioning. Accordingly, 
Exp. I involved a paired-associates 
list in which there was a minimum of 
competition among the paired words 
and in which the associative connections
between the paired words were
initially high.


Method 


Subjects.—The Ss were 20 men and 20
women enrolled in an introductory psychology
course, an equal number of each sex having
scored in either the upper 20% or lower 20% of
scores on the A scale. All were naive with
respect to the experimental task.

Apparatus.—A Hull-type memory drum was
employed to present the lists of paired-associates
learning material. The successive stimulus
items of each list were exposed every 4 sec.,
including a 1.67-sec. anticipation interval, with
a 4-sec. rest interval between successive presentations
of a list. The practice list (15 paired
nouns) was used to acquaint S with the procedure
and to provide maximal and minimal
performance criteria. The test list, shown in
Table 1, consisted of 15 pairs of two-syllable
adjectives from Haagen’s word list (7), and was
constructed in such a manner as to maximize





300 


TABLE 1 

Noncompetitive and Competitive Test Lists 
Used in Exp. I and II 

Noncompetitive: Exp. I        Competitive: Exp. II 
Stimulus     Response         Stimulus      Response 
Adept        Skilful          *Barren       Fruitless 
Barren       Fruitless        Arid          Grouchy 
Complete     Thorough         Desert        Leading 
Distant      Remote           *Little       Minute 
Empty        Vacant           Petite        Yonder 
Frigid       Arctic           Undersized    Wholesome 
Insane       Crazy            *Roving       Nomad 
Little       Minute           Gypsy         Opaque 
Mammoth      Oversize         Migrant       Agile 
Pious        Devout           *Tranquil     Placid 
Roving       Nomad            Quiet         Double 
Stubborn     Headstrong       Serene        Headstrong 
Tranquil     Quiet 
Urgent       Pressing 
Wicked       Evil 

* S-R terms in the competitive test list that were 
taken from the noncompetitive list of Exp. I. 


closeness (strength) of association between 
paired stimulus-response words. Meaningful 
intralist associations and formal similarities were 
minimized. Thus, no beginning letter or suffix 
was repeated within the stimulus or response 
list and no stimulus-response pair began with 
the same letter or had the same suffix. Both 
lists were presented to S in three different orders 
to avoid serial learning. 

Procedure.—All Ss served individually under 
the same experimental conditions. Immediately 
following the reading of the instructions de- 
scribing the method of learning, S received six 
trials on the practice list followed by a 2-min. 
rest period. During this rest period S was 
moved to a seat before the screen containing the 
drum with the test list. Following this interval, 
S was run to a criterion of two successive perfect 
trials on the test list. 
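
The learning criterion just described can be expressed as a short scoring routine. This is a minimal sketch: the function name and the convention that the two criterion trials are themselves counted in the score are assumptions, since the text does not say how the trials-to-criterion measure was tallied.

```python
def trials_to_criterion(correct_per_trial, list_length, run_length=2):
    """Return the 1-based trial on which the criterion run is completed.

    The criterion is `run_length` successive perfect trials, a perfect
    trial being one on which every item in the list is anticipated
    correctly. Returns None if the criterion is never reached.
    """
    run = 0
    for trial, n_correct in enumerate(correct_per_trial, start=1):
        run = run + 1 if n_correct == list_length else 0
        if run == run_length:
            return trial
    return None


# Hypothetical record for one S on the 15-pair test list.
example_record = [9, 12, 15, 14, 15, 15]
print(trials_to_criterion(example_record, list_length=15))
```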

On each trial, correct anticipations, errors, 
and overt intrusions were recorded. An error 
consisted in either making no response or an 
incorrect response (an overt intrusion) during 
the anticipation interval. 

The Ss were discarded on the basis of their 
scores on the practice list if they failed to make 
a single correct response, or if they made 50 or 
more correct responses during the six practice 
trials. Only one S was discarded on the basis 
of these minimal and maximal performance 
criteria, and he was replaced by another. 
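
The minimal and maximal performance criteria stated above amount to a simple screening rule on the practice-list totals. The sketch below is illustrative only; the function name and the example scores are hypothetical.

```python
def passes_practice_screen(correct_by_trial, max_total=50):
    """Apply the practice-list criteria to one S.

    An S is retained only if at least one correct response was made and
    fewer than `max_total` correct responses were made over the practice
    trials (50 or more leads to discarding, as stated in the text).
    """
    total_correct = sum(correct_by_trial)
    return 0 < total_correct < max_total


# Hypothetical scores on the six practice trials.
print(passes_practice_screen([0, 0, 0, 0, 0, 0]))   # no correct response
print(passes_practice_screen([1, 2, 2, 3, 3, 3]))   # within both limits
print(passes_practice_screen([8, 8, 8, 8, 9, 9]))   # total of 50
```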


Results 


The mean number of correct antici- 
pations made on the practice list was 


K. S. SPENCE, I. E. FARBER, AND H. H. McFANN 


14.0 for the high-anxious group and 
13.7 for the low-anxious group. These 
values are to be compared with a mean 
of 14.7 for a more extensive sample of 
267 high-anxious Ss that have been 
run on the same list and a mean of 
13.8 for a sample of 255 low-anxious 
Ss. Thus, it will be seen that the 
difference between the present samples 
in favor of the high group is somewhat 
smaller than in the more extensive 
samples. 

Learning curves on the test list for 
the high- and low-anxious groups in 
terms of the mean percentage of cor- 
rect anticipations made on Trials 2—11 
are presented in Fig. 1. As may be 
seen, the curves rise rapidly, with that 
for the high-anxious group starting 
and remaining consistently above the 
curve for the low group. 

Data on learning in terms of errors 
and trials to the criterion of mastery 
are presented in Table 2. It will be 
observed that the high-anxious Ss 
were superior to the low-anxious Ss 
in the case of both measures. The 
results of an analysis of variance gave 
Fs for the anxiety variable which were 
significant at the .01 level for the 
trial measure and the .05 level in the 


Fig. 1. Paired-associates learning as a 
function of anxiety under conditions of minimal 
interpair competition and high initial stimulus-response 
associative strength. [Curves: high 
anxiety (N = 20) and low anxiety (N = 20).] 





PAIRED-ASSOCIATES LEARNING 


TABLE 2 
Performance on Noncompetitive Test List 

                         Trials             Errors 
Group            N     Mean     SD       Mean      SD 
High-anxious    20     8.95    2.75     20.95    10.49 
Low-anxious     20    12.60    4.67     32.50    20.91 





case of the error measure. In both 
instances the Anxiety X Sex inter- 
action was less than one, indicating 
that the difference between high- and 
low-anxious Ss held for both sexes. 
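
As a rough check on the size of the group differences, a t statistic can be computed directly from the means and SDs of Table 2. This sketch is not the analysis of variance reported above (which also included sex as a factor); it assumes that the first pair of columns in Table 2 refers to trials and the second to errors, that the SDs are sample standard deviations, and that each group contained 20 Ss.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic for two independent groups from summary statistics.

    Uses the unpooled standard error; with equal group sizes this gives
    the same value as the pooled-variance formula.
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Low-anxious minus high-anxious, values as read from Table 2.
t_trials = t_from_summary(12.60, 4.67, 20, 8.95, 2.75, 20)
t_errors = t_from_summary(32.50, 20.91, 20, 20.95, 10.49, 20)

print(round(t_trials, 2), round(t_errors, 2))
```

On these assumptions the statistic is larger for the trial measure than for the error measure, which agrees in direction with the significance levels reported for the two measures.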


Experiment II 


In contrast to Exp. I, a portion of 
the list of paired associates used in 
Exp. II involved learning in which 
competing response tendencies ini- 
tially stronger than the correct re- 
sponses were present. Our theory 
would lead us to expect that the high- 
anxious Ss would perform more poorly 
than the low-anxious Ss on these 
paired associates. 


Method 


Subjects.—The Ss were all men, 10 of whom 
scored in the lowest 20% of scores on the A 
scale and 9 of whom were above the 80th percentile. 
Three additional Ss failed to meet the 
criteria established for the learning of the 
practice list and were discarded. 

Apparatus and procedure.—The memory 
drum, instructions, and practice list were 
exactly the same as those used in the first ex- 
periment. Likewise, the procedure was identi- 
cal, the Ss first receiving six trials on the practice 
list and then being shifted to the test list, which 
they were required to learn to a criterion of two 
successive perfect trials. 

The test list of paired adjectives employed 
in this experiment is given in Table 1. As may 
be seen, it consisted in part of four paired 
adjectives (marked by an asterisk) based on the 
test list of Exp. I. The associative connections 
between the words of these pairs were very 
high. For each of the stimulus words of these 
four pairs two synonymous adjectives were 
selected as stimulus words by means of Haagen’s 
study. Each of these eight stimulus words was 
paired with an adjective with which it had little 
or no associative connection. The data for 
these two different kinds of paired associates 
(those with high and those with low associative 
connections) were treated separately, since the 
theoretical predictions with respect to them 
differ. 


Results 


The high-anxious Ss averaged 15.8 
correct anticipations on the practice 
lists as compared with 14.2 for the 
low Ss. This difference in favor of 
the high group was somewhat larger 
than that for the more extensive 
samples (see Results section for Exp. 
I). The difference is not, however, a 
significant one. 

Figure 2 presents learning curves 
for the high and low groups in terms 
of the percentage of correct anticipations 
made on successive pairs of 
trials. The two lower curves represent 
the performance on the eight 
weakly associated word pairs that 
had strong competing responses, while 
the two upper curves are for the four 
pairs in which the words were initially 
highly associated. 

Fig. 2. Paired-associates learning as a 
function of anxiety under conditions of high 
interpair competition. Word pairs of both high- 
and low-association value were interspersed 
within the same training list, but were analyzed 
separately. [Ordinate: mean percent correct 
responses; abscissa: pairs of trials. Curves: high 
assoc.-low anx., high assoc.-high anx., low 
assoc.-low anx., low assoc.-high anx.] 

As our theory predicted, the per- 
formance curve for the highly anxious 
Ss was below that of the low-anxious 
Ss in the case of the eight word pairs 
that involved competition. However, 
the difference in number of errors for 
Trials 2-23 was not significant (t 
= 1.56). In accord with the de- 
duction concerning the four stimulus- 
response pairs of the list that had high 
associative connections, we find, as 
predicted, that the performance of the 
high-anxious Ss was initially superior 
to that of the low-anxious Ss, although 
the difference was very slight, and, 
also, that there was a reversal later 
in the learning. It should be noted, 
further, that the differences between 
the two groups of Ss for the two types 
of paired associates were opposite in 
nature at the beginning of learning. 
Thus, the high-anxious Ss did better 
than the low-anxious Ss on the four 
word pairs involving no competition 
at the same time that they were doing 
more poorly on the eight word pairs 
involving competition. 

A final set of data pertains to the 
number of trials required to learn the 
total list. This measure was, of 
course, determined primarily by the 
eight difficult word pairs involving 
competition. The mean for the low- 
anxious group was 18.4, with that for 
the high group being 23.3. The 
difference was significant at the .05 
level (t = 2.48). Thus we see that, 
whereas the high-anxious Ss showed 
the superior performance in Exp. I, 
the low-anxious Ss were superior in 
Exp. II. 




Discussion 


From a theoretical standpoint the 
most interesting finding of this investi- 
gation is that the high-anxious Ss per- 
formed in a superior manner to the low- 
anxious Ss in Exp. I. In our previous 
studies that have involved learning 
situations more complex than classical 
conditioning (e.g., 7, 28), high-anxious 
Ss performed more poorly than low- 
anxious Ss. We ascribed these results 
to the presence of strong competing 
responses (anticipatory and persevera- 
tive tendencies) that develop in serial 
learning tasks. In the first paired-associates 
task reported here (Exp. I) 
such competing responses were minimized 
by controlling for generalization 
as described in the introductory section. 
The fact that the correct responses had 
high initial associative connections with 
their respective stimuli also assured a 
greater initial differential, with higher 
drive strength, between the excitatory 
strengths of the correct responses and the 
excitatory strengths of any incorrect, 
competing responses, i.e., E_+ - E_- 
= D(H_+ - H_-). The fact that the 
high-anxious Ss were superior right from 
the start is in agreement with our 
analysis. Furthermore, it may be pre- 
dicted that if the associative connections 
between the stimulus and response items 
were low or nonexistent at the beginning 
of training (and providing competition 
were minimized by the methods de- 
scribed), there would be no initial dif- 
ference between anxious and nonanxious 
groups, but one would develop in favor 
of the anxious Ss as learning progressed. 
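
The dependence of the excitatory differential on drive level can be made concrete with a small worked example. It assumes the multiplicative relation E = D x H, so that the differential between the correct and a competing response equals D times the difference in their habit strengths; all numerical values below are hypothetical and serve only to show the direction of the effect.

```python
def excitatory_differential(drive, habit_correct, habit_competing):
    """Difference in excitatory strength between correct and competing
    responses, assuming E = D * H for each response."""
    return drive * habit_correct - drive * habit_competing


# Hypothetical drive levels (assumptions, not measured values).
LOW_DRIVE, HIGH_DRIVE = 1.0, 2.0

# Case 1: correct response initially stronger (as intended in Exp. I).
favorable_low = excitatory_differential(LOW_DRIVE, 0.6, 0.2)
favorable_high = excitatory_differential(HIGH_DRIVE, 0.6, 0.2)

# Case 2: competing response initially stronger (competitive pairs of Exp. II).
unfavorable_low = excitatory_differential(LOW_DRIVE, 0.2, 0.6)
unfavorable_high = excitatory_differential(HIGH_DRIVE, 0.2, 0.6)

print(favorable_low, favorable_high)
print(unfavorable_low, unfavorable_high)
```

Raising D enlarges the differential in whichever direction the habit strengths already favor: it helps when the correct response is the stronger and hurts when a competing response is the stronger.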

On the other hand, when strong re- 
sponse tendencies in competition with 
the correct response were provided by 
means of the methods used in Exp. II, 
this advantage of high- over low-anxious 
Ss in paired-associates learning dis- 
appeared and the low-anxious Ss actually 
required significantly fewer trials (P 
= .05) to learn than did the high-anxious 
Ss. 

One final series of comments concerns 
the interpretation of these studies relating 
anxiety to learning that has been 
offered by Child (5). Child would 
explain the inferior performance of 
anxious Ss in situations involving competing 
responses elicited by the task 
stimuli in terms of task-irrelevant responses 
made to the anxiety, i.e., 
irrelevant responses that interfere with 
performance in the task. Although 
Child has expressed the view that our 
interpretation had overlooked the role 
of such responses, we were actually well 
aware of such a possibility, and have for 
some time been interested in the role 
of such task-interfering responses, which 
we think of as being elicited by the drive 
stimuli (s_D) resulting from the emotional 
(drive) state.⁴ 

That such distracting, task-interfering 
responses will under certain conditions 
occur we have no doubt. One of the real 
difficulties is to know when and to what 
extent they function. From our point 
of view they are a nuisance, in the sense 
of a difficult-to-control factor that acts 
to obscure the role of D in competing 
response situations. Accordingly, with 
our primary interest in these studies 
being in the role of D rather than s_D, we 
have deliberately attempted to employ 
conditions in which such interfering 
responses would be at a minimum. So 
far as we have been able to observe, our 
high-anxious Ss have not tended to 
engage in distracting irrelevant activities 
to any greater extent than our low-anxious Ss. 
Possibly the reason for this 
is that our experimental situations have 
not been so stressful as to provide the 
degree of emotionality that would elicit 
much of this kind of behavior. 

The findings of Ramond’s study (14) 
are of some interest in this connection. 
He employed a choice-learning situation 
in which S had to learn to choose one of 
two alternative response words for each 
of 16 stimulus words. In half of the 
items the associative connection of the 
correct response word was stronger than 
that of the incorrect response word, and 
in the other half, the incorrect response 
was the stronger. It was found that 
under the condition in which the incorrect 
response was stronger, the anxious 
Ss did significantly worse than the 
nonanxious Ss, but under the reverse 
condition there was not a significant 
difference in over-all performance, although 
the anxious Ss started out better 
and subsequently became poorer than 
the nonanxious Ss in the later portion 
of the learning. Since the task-interfering 
behavior, if there was any, would 
presumably be equal for the two kinds 
of learning items, which were intermixed 
with each other in the list, the relatively 
inferior performance of the anxious Ss 
with one set of items must be accounted 
for by some other factor than distracting, 
task-interfering responses. Our explanation 
would be that the greater 
drive level of the anxious Ss increased 
the unfavorable difference in the competing 
excitatory potentials in the 
direction of the incorrect responses and 
thus led to a greater likelihood of occurrence 
of such erroneous responses. 

⁴ In this connection attention is called to the 
fact that the series of studies by Amsel and his 
colleagues (1, 2, 3) which have been concerned 
with investigating the differential effects of D 
and the interfering responses elicited by s_D 
originated in this laboratory. 

It is interesting to speculate in con- 
nection with Ramond’s findings that 
both mechanisms (D and s_D) were 
operative, the two acting jointly to 
lower the performance of the anxious Ss 
relative to that of the nonanxious Ss in 
the case of the items in which the in- 
correct response was the stronger, while 
their effects were opposed in the case of 
the other type of item. Thus, whereas 
higher D would tend to give an advantage 
to the anxious Ss in the case in 
which the correct response was the 
stronger, interfering responses elicited 
by the cue aspects of anxiety would 
favor the nonanxious Ss. If this interpretation 
is correct, we see that the 
effects of the interfering activities must 
have become greater as the learning 
proceeded. 

In concluding, attention should be 
directed to the point that Child’s 
theorizing is not opposed to ours. Both 
operate within the framework of Hullian 
S-R theory. Our experiments have 
merely been somewhat more restricted 
in interest, being mainly centered on the 
role of D in determining behavior, 
rather than in the other possible func- 
tions of anxiety, including its drive cue 
(s_D) aspects. 


SUMMARY 


On the basis of the assumption that the A 
scale measures degree of emotionality and, 
hence, level of D, and the further assumption 
that the effect of variations in the level of D 
upon performance in learning depends upon the 
position in the response hierarchy of the re- 
sponses to be learned, different predictions were 
made concerning the relative performance of 
high- and low-anxious Ss in two different verbal 
learning situations. In the case of a list of 
paired associates having a minimum of generali- 
zation among the S-R pairs, and, therefore, 
little competition among responses, it was pre- 
dicted that highly anxious Ss would perform 
better than nonanxious Ss. In the case of a 
list in which competing, incorrect responses 
could be expected to be stronger than correct 
responses, it was predicted that highly anxious 
Ss would perform more poorly than nonanxious 
Ss. 

In Exp. I, using a noncompetitive list, the 
anxious Ss made significantly fewer errors and 
required significantly fewer trials to reach the 
learning criterion than did the nonanxious Ss. 
In Exp. II, using a list mainly composed of 
highly competitive items, anxious Ss required 
significantly more trials to reach the criterion. 

The necessity of minimizing the possible 
confounding effects of responses elicited by the 
drive stimuli (s_D) resulting from emotionality 
when one studies the effects of drive level (D) 
upon learning performance is strongly 
emphasized. 


REFERENCES 


1. Amsel, A. The combination of a primary 
appetitional need with primary and 
secondary emotionally derived needs. J. 
exp. Psychol., 1950, 40, 1-14. 

2. Amsel, A., & Cole, K. F. Generalization 
of fear-motivated interference with water 
intake. J. exp. Psychol., 1953, 46, 243- 
247. 

3. Amsel, A., & Maltzman, I. The effect 
upon generalized drive strength of 
emotionality as inferred from the level 
of consummatory response. J. exp. 
Psychol., 1950, 40, 563-569. 

4. Brown, J. S., Kalish, H. I., & Farber, I. E. 
Conditioned fear as revealed by magnitude 
of startle response to an auditory 
stimulus. J. exp. Psychol., 1951, 41, 
317-328. 

5. Child, I. L. Personality. Annu. Rev. 
Psychol., 1954, 5, 149-171. 

6. Farber, I. E., & Spence, K. W. Complex 
learning and conditioning as a function 
of anxiety. J. exp. Psychol., 1953, 45, 
120-125. 

7. Haagen, C. H. Synonymity, vividness, 
familiarity, and association value ratings 
of 400 pairs of common adjectives. J. 
Psychol., 1949, 27, 453-463. 

8. Kalish, H. I. Strength of fear as a function 
of the number of acquisition and extinction 
trials. J. exp. Psychol., 1954, 47, 
1-9. 

9. Kent, G. H., & Rosanoff, A. J. A study 
of association in insanity. Amer. J. 
Insanity, 1910, 67, 37-96; 317-390. 

10. Miller, N. E. Learnable drives and 
rewards. In S. S. Stevens (Ed.), Handbook 
of experimental psychology. New 
York: Wiley, 1951. Pp. 435-472. 

11. Montague, E. K. The role of anxiety in 
serial rote learning. J. exp. Psychol., 
1953, 45, 91-98. 

12. Mowrer, O. H. A stimulus-response 
analysis of anxiety and its role as a 
reinforcing agent. Psychol. Rev., 1939, 
46, 553-565. 

13. Passey, G. E. The influence of intensity 
of unconditional stimulus upon acquisition 
of a conditional response. J. exp. 
Psychol., 1948, 38, 420-428. 

14. Ramond, C. K. Anxiety and task as determiners 
of verbal performance. J. exp. 
Psychol., 1953, 46, 120-124. 

15. Siegel, P. S., & Brantley, J. J. The 
relationship of emotionality to the 
consummatory response of eating. J. 
exp. Psychol., 1951, 42, 304-306. 

16. Siegel, P. S., & Siegel, H. S. The effect 
of emotionality on the water intake of 
the rat. J. comp. physiol. Psychol., 1949, 
42, 12-16. 

17. Spence, K. W. Learning and performance 
in eyelid conditioning as a function of the 
intensity of the UCS. J. exp. Psychol., 
1953, 45, 57-63. 

18. Spence, K. W. Current interpretations of 
learning data and some recent developments 
in stimulus-response theory. In 
Learning theory, personality theory, and 
clinical research. New York: Wiley, 
1954. Pp. 1-21. 

19. Spence, K. W. Behavior theory and conditioning. 
New Haven: Yale Univer. 
Press, in press. 

20. Spence, K. W., & Farber, I. E. Conditioning 
and extinction as a function of 
anxiety. J. exp. Psychol., 1953, 45, 
116-119. 

21. Spence, K. W., Farber, I. E., & Taylor, 
E. The relation of electric shock and 
anxiety to level of performance in eyelid 
conditioning. J. exp. Psychol., 1954, 48, 
404-408. 

22. Spence, K. W., & Taylor, J. A. Anxiety 
and strength of the UCS as determiners 
of the amount of eyelid conditioning. J. 
exp. Psychol., 1951, 42, 183-188. 

23. Taylor, J. A. The relationship of anxiety 
to the conditioned eyelid response. J. 
exp. Psychol., 1951, 41, 81-92. 

24. Taylor, J. A., & Spence, K. W. The 
relationship of anxiety level to performance 
in serial learning. J. exp. Psychol., 
1952, 44, 61-64. 


(Received October 29, 1955) 





Journal of Experimental Psychology 
Vol. 52, No. 5, 1956 


ANXIETY (DRIVE) LEVEL AND DEGREE OF 
COMPETITION IN PAIRED-ASSOCIATES 
LEARNING ¹ 


K. W. SPENCE, JOHN TAYLOR,² AND RHODA KETCHEL ³ 


State University of Iowa 


In a recent study (5) two experi- 
ments were reported in which the 
learning of paired associates by Ss 
scoring at the high and low end of the 
Taylor A(nxiety) scale was compared. 
In the first experiment it was found 
that high-A Ss performed in a su- 
perior manner to low-A Ss in learning 
a list of paired associates in which 
competition among the responses was 
minimized. In the second experi- 
ment, on the other hand, the low-A 
Ss required fewer trials to learn a list 
in which competition among the 
responses was theoretically stronger 
than did the high-A Ss. The low-A 
Ss also made fewer errors than the 
high-A Ss although the difference in 
this measure was not significant. 

These experimental results were 
predicted on the basis of a theory that 
the A-scale reflects, in part, the 
general drive level (D) of S, and the 
implication of that theory of the 
multiplicative action of D on habit 
strength (H) that the effect of variation 
in the level of D upon performance 
in this type of learning depends 
upon the position in the habit hierarchy 
of the response to be learned 
(1, 4, 7). If the appropriate response 
is relatively strong in comparison 
with possible competing responses, it 
may be shown that the high-A Ss 
should do better than low-A Ss. On 
the other hand, if the appropriate 
response is initially lower in habit 
strength than competing responses, 
then the opposite finding would be 
expected, at least in the early stages 
of learning. 

¹ This study was carried out as part of a 
project conducted under contract N9 onr-93802, 
Project 154-107 between the State University 
of Iowa and the Office of Naval Research. 

² Now at Human Research Unit No. 3, 
CONARC, Fort Benning, Georgia. 

³ Now at The RAND Corporation, Santa 
Monica, California. 

In view of the fact that the second 
experiment mentioned above involved 
a rather small number of Ss, 10 in the 
high-A group and only 9 in the low-A 
group, it was decided to conduct a 
further test of the theory. The 
present experiment differs from the 
first in a number of respects, particularly 
in that the two experiments 
were combined into a single one of the 
factorial design. The purpose of this 
design was to test whether the inter- 
action between anxiety level and 
performance on the two types of lists, 
competitive and noncompetitive, 
would be significant. 


Method 


Subjects.—The Ss were 40 men and 40 women 
enrolled in an introductory psychology course, 
equal numbers of each sex having scored in 
either the upper 20% (anxious) or lower 20% 
(nonanxious) on the scale of manifest anxiety 
developed by Taylor (6). All were naive with 
respect to the experimental task, and had no 
previous experience with paired-associates learn- 
ing. Equal numbers of males and females 
(anxious and nonanxious) were assigned to the 
two learning lists. Inasmuch as two Es ran the 
Ss, the groups were further subdivided so that 
each E ran half of the Ss in each category. Thus 
the present study was designed to investigate the 
joint effects of anxiety and type of list (competitive 
or noncompetitive) upon paired-associates 
learning, with appropriate controls for 
sex and E. 
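
The allocation of Ss described above implies a fully crossed layout with equal cell sizes. The following sketch enumerates the cells and the number of Ss per cell that follows from the stated totals; the labels used for the factor levels are hypothetical.

```python
from itertools import product

# Factors named in the text; the level labels are illustrative only.
factors = {
    "anxiety": ["high-A", "low-A"],
    "sex": ["male", "female"],
    "list": ["List I", "List II"],
    "experimenter": ["E1", "E2"],
}

cells = list(product(*factors.values()))
total_subjects = 80  # 40 men and 40 women, as stated in the text

# Equal allocation of the Ss over all cells of the crossed design.
subjects_per_cell, remainder = divmod(total_subjects, len(cells))

print(len(cells), subjects_per_cell, remainder)
```

With four two-level factors there are 16 cells, so 80 Ss divide evenly at 5 Ss per cell, or 20 Ss in each Anxiety x List group.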

Apparatus and materials.—Hull-type memory 
drums were employed to present the lists of 
paired associates. The successive stimulus items 
were exposed every 4 sec. with a 4-sec. interval 
between the successive presentations of the list. 
In the case of the practice list there was a 2-sec. 
anticipation interval, while a 1.67-sec. antici- 
pation interval was employed in the case of the 
learning lists. The practice list (15 paired 
nouns) was used to acquaint Ss with the pro- 
cedure and to provide maximal and minimal 
performance criteria. 

Learning List I, the noncompetitive list, 
consisted of 14 paired adjectives. As in the 
case of the comparable list used in the first 
experiment of the previous study (5) this list 
was constructed in such a manner as to insure 
that the tendency to give the correct response 
in the case of each paired item would be rela- 
tively strong, whereas incorrect response ten- 
dencies would be weak or nonexistent. This 
was accomplished by pairing adjectives from 
Haagen’s word list (2) that were judged to have 
a close (high) associative connection. Incorrect, 
competing response tendencies were minimized 
in strength by (a) presenting the pairs in three 
different orders so as to prevent the development 
of remote associations, and (b) arranging for no 
formal similarities or similarity of meanings 
(synonymity) among either the stimulus or 
response items or among the nonpaired stimulus 
and response words. 

Learning List II, the competitive list, consisted 
of 10 paired adjectives taken from 
Haagen’s study (2). This list, shown in Table 
1, was constructed in the same manner as was 
described in the previous study. Four of the 
paired adjectives, marked by asterisks, had high 
associative connections between the stimulus 
and response items, while six of the stimulus 
words were paired with response words with 
which they had little or no associative connection. 
Since the stimulus words of these 
latter pairs were synonymous with one or the 
other of the stimulus words making up the 
highly associated pairs, they also had relatively 
high associative connections with the response 
words of these latter pairs. Thus strong competing 
response tendencies were present in the 
case of these six stimulus words from the beginning 
of learning. Since the theoretical predictions 
with respect to these two kinds of 
paired associates (those with high and those with 
low associative connections with the correct 
response words) were different, the data for 
them were treated separately. 

TABLE 1 

Competitive List II 

Stimulus      Response 
Quiet         Double 
*Little       Minute 
*Roving       Nomad 
Serene        Headstrong 
Arid          Grouchy 
*Tranquil     Placid 
Petite        Yonder 
Desert        Leading 
Migrant       Agile 
*Barren       Fruitless 

* Highly associated word pairs. 

Procedure.—All Ss served individually under 
the same experimental conditions. Immediately 
following the reading of the instructions de- 
scribing the method of learning, each S received 
six trials on the practice list followed by a 2-min. 
rest period. During this rest period S was 
moved to a seat before the screen containing the 
drum with one or other of the learning lists. 
Following this interval S was given 20 trials on 
List I or 30 trials on List II. The Ss who met 
the criterion of two successive perfect trials 
before 20 or 30 trials respectively were dis- 
continued on reaching the criterion. 

On each trial, correct anticipations, errors, 
and overt intrusions were recorded. An error 
consisted in either making no response or an 
incorrect response (overt intrusion) during the 
anticipation interval. 

Some Ss were discarded on the basis of their 
scores on the practice list if they failed to make a 
single correct response, or if they made 50 or 
more correct responses during the six practice 
trials. Four Ss were discarded on the basis of 
these criteria, one high-A S who made more than 
50 correct responses, two high-A Ss who made 
zero correct responses, and one low-A S who 
made zero correct responses. All four were 
replaced. 

In order to determine more precisely whether 
S anticipated the appearance of the response 
word in the aperture of the drum, S’s verbal 
response was recorded by means of a two-stylus 
polygraph. One stylus was activated by a 
microswitch which in turn was operated by a 
cam on the memory drum. This stylus pro- 
vided a record of the 1.67-sec. anticipation 
interval and the 2.33-sec. interval during which 
the response word was visible to S. The second 
stylus was activated by the amplified signal from 
a Turner 82-3H crystal microphone which rested 
on S's chest. Inspection of the polygraph 
record made it possible to determine whether 
S’s response had occurred within the anticipation 
interval. Thus systematic and variable errors 
in judgment of E with respect to the responses 
of S were minimized. 
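
The scoring decision made from the polygraph record can be sketched as a comparison of response latency with the anticipation interval. The interval durations are those given above; measuring latency from stimulus onset, treating a response exactly at the boundary as too late, and the function name are assumptions made for illustration.

```python
ANTICIPATION_SEC = 1.67   # stimulus word alone (from the text)
RESPONSE_SHOWN_SEC = 2.33  # response word visible (from the text)


def within_anticipation_interval(latency_sec):
    """Return True if S's vocal response began during the anticipation
    interval, i.e., before the response word appeared in the aperture.

    `latency_sec` is the time from stimulus onset to the start of the
    vocal response; None stands for no response on that item.
    """
    if latency_sec is None:
        return False
    return 0.0 <= latency_sec < ANTICIPATION_SEC


# Hypothetical latencies read from a record.
for latency in (1.20, 2.00, None):
    print(latency, within_anticipation_interval(latency))
```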







Results 


Comparison of the performance of 
the groups on the practice list indi- 
cated that the high- and low-A groups 
given List I were closely comparable 
to more extensive samples of Ss that 
have been used over the last three 
years on the same list. Thus the 
high-A group made 14.9 correct antici- 
pations on the practice list and the 
low-A Ss 13.3. These values are to 
be compared with a mean of 15.0 for 
a sample of 336 high-A Ss and a mean 
of 13.9 for a sample of 287 low-A Ss. 
The practice list scores of the groups 
run on List II were somewhat above 
average, being 17.25 for the high-A 
Ss and 16.45 for the low-A Ss. The 
relative performances of the two 
groups, however, were quite comparable 
to those of the larger samples, 
the high-A group performing slightly 
better than the low-A group. 

Table 2 presents data on learning 
in terms of the mean number of errors 
per word pair made in Trials 2-20 for 
List I and for the six paired items 
involving competition of List II. As 
may be seen the high-A Ss performed 
better on List I while the low-A Ss 
were slightly better on the competitive 
List II. As indicated in the introduction, 
the primary interest of this 
factorially designed study was centered 
on the interaction between 
anxiety level and the two kinds of 
learning lists. An analysis of variance 
of the error data in Table 2 gave 
an F value for this interaction of 4.34, 
which for the appropriate degrees of 
freedom was significant at between 
the .01 and .05 levels. 

TABLE 2 
Mean Number of Errors per Word Pair in Trials 2-20 
[Table body not legible in the source.] 

Fig. 1. Paired-associates learning as a 
function of anxiety under conditions of high 
interpair competition. Word pairs of both high 
and low association values were interspersed 
within the same training list, but were analyzed 
separately. [Curves: assoc. pairs-low A, assoc. 
pairs-high A, nonassoc. pairs-low A, nonassoc. 
pairs-high A.] 

Turning to the results for the 
separate lists, it is found that, as in 
the case of the previous study, the 
high-A Ss performed significantly 
better than the low-A Ss on the noncompetitive 
list (P < .05). The 
results for the competitive list also 
duplicated the findings of the earlier 
study. On this list the low-A Ss 
made the smaller number of errors, 
but again the difference was not a 
significant one. Exactly the same 
findings were obtained for error meas- 
ures based on all 10 word pairs of 
List II. 

Learning curves for the high and 
low groups on the two types of word 
pairs in List II are presented in Fig. 
1. The two lower curves represent 
the performance in the six nonasso- 
ciated word pairs that had strong 
competing responses, while the two 
upper curves are for the four pairs in
which the words were initially highly
connected. The picture presented 
here is almost an exact duplicate of 
the findings of the earlier study (5) 
and also those of a somewhat different 
type of verbal learning situation em- 
ployed by Ramond (3). In all three 
instances the low-A Ss performed
better than the high-A Ss on the word 
pairs involving strong competition. 
Similarly all three studies showed 
the same pattern in the case of the 
word pairs involving initially high 
associative connections. After a brief 
period of initial superiority on the 
part of the high-A Ss, the performance 
curves crossed and the high-A Ss were 
subsequently poorer than the low-A 
Ss. While the differences between 
the two groups at the different stages 
of learning were not statistically 
reliable, the consistency with which 
the rather intricate pattern of results 
has repeated itself in all three studies 
suggests that the phenomenon is
genuine and not just a chance one.


Discussion 


The significant interaction between 
anxiety level and type of paired-asso- 
ciates item obtained in the present 
experiment indicates again that high- 
A and low-A Ss perform differently in 
learning situations depending upon the 
extent to which strong incorrect re- 
sponses are in competition with the 
correct, appropriate response. In ac- 
cord with the interpretation that 
anxiety level reflects, in part, the level of 
general drive (D) of an S, it should be 
expected that higher D levels would 
produce superior performance in situ- 
ations in which the habit strength of the 
correct response is relatively strong as 
compared with those of any competing 
responses. Since this was the kind of 
situation provided in the word pairs of 
List I, the finding that high-A Ss were 
superior to low-A Ss is in agreement with 
the theoretical analysis. 




Under conditions in which the habit 
strength of the correct response is weaker 
than one or more competing responses 
a higher D level would be expected to
result in poorer performance. This im-
plication follows from the assumption
that D will multiply the habit strengths
of both the correct and incorrect re-
sponses, thus increasing the amount by
which the excitatory strength (E) of any
stronger competing response will exceed
that of the correct response. Since per-
formance is assumed to be a function of
the magnitude of the difference between 
the excitatory potentials of the correct 
and incorrect responses, it is obvious 
that the higher the level of D the greater 
will be the advantage of the incorrect 
responses and hence the greater the 
likelihood of the occurrence of such
erroneous responses.
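The multiplicative argument above may be illustrated with a brief numerical sketch. It assumes only what the text states, namely that excitatory strength is the product of D and habit strength and that performance depends on the difference between the correct and the competing response. The function names and all numerical values are hypothetical and were chosen for illustration; they are not estimates from the data.

```python
def excitatory_strength(drive, habit_strength):
    """E = D x H, the multiplicative assumption stated in the text."""
    return drive * habit_strength


def advantage_of_correct(drive, h_correct, h_competing):
    """E of the correct response minus E of the competing response."""
    return (excitatory_strength(drive, h_correct)
            - excitatory_strength(drive, h_competing))


# Hypothetical drive levels and habit strengths (illustrative only).
LOW_D, HIGH_D = 1.0, 2.0
items = {
    "noncompetitive (correct habit stronger)": (0.6, 0.2),
    "competitive (competing habit stronger)": (0.2, 0.6),
}

for label, (h_correct, h_competing) in items.items():
    low = advantage_of_correct(LOW_D, h_correct, h_competing)
    high = advantage_of_correct(HIGH_D, h_correct, h_competing)
    print(f"{label}: low D -> {low:+.2f}, high D -> {high:+.2f}")
```

Under these assumed values the higher D enlarges the advantage of the correct response when its habit strength is the greater, and enlarges its disadvantage when a competing habit is the stronger, which is the direction of the interaction discussed in the text.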

List II was constructed in a manner
calculated to produce a learning situ-
ation in which, in the case of 6 of the 10
word pairs, the habit strengths of the
correct responses would initially be
weaker than certain competing, incorrect
responses that were among those to be
given in learning the list. As predicted,
the high-A Ss performed more poorly
than did the low-A Ss on these items
although the difference was not sig-
nificant.

The findings obtained with the four
word pairs of List II that had high
initial associative connections have been
interpreted in terms of generalized
interference from the later learning of
the correct responses in the other six
word pairs (3, 4). Whether or not this
is the correct interpretation, the phe-
nomenon of reversal of the high- and
low-A groups on such initially strongly
associated word pairs embedded in such
situations has been obtained in three
separate experiments.


SUMMARY 


The present experiment compared the per-
formances of Ss scoring at the high and low ends
of the Taylor manifest anxiety scale in paired- 
associates learning involving different degrees of 


competition. A significant interaction (P
< .05) was found between anxiety level and
type of paired-associates item, high-anxious Ss
being superior to low-anxious Ss when the
learning involved a minimum of competition
and the low-A Ss being somewhat better than
high-A Ss (but not significantly so) when the
learning involved strong competing response
tendencies.


REFERENCES 


1. Farber, I. E., & Spence, K. W. Complex
learning and conditioning as a function of
anxiety. J. exp. Psychol., 1953, 45,
120-125.

2. Haagen, C. H. Synonymity, vividness,
familiarity, and association value ratings
of 400 pairs of common adjectives. J.
Psychol., 1949, 27, 453-463.








K. W. SPENCE, JOHN TAYLOR, AND RHODA KETCHEL 


3. Ramond, C. K. Anxiety and task as deter-
miners of verbal performance. J. exp.
Psychol., 1953, 46, 120-124.

4. Spence, K. W. Behavior theory and con-
ditioning. New Haven: Yale Univer.
Press, 1956.

5. Spence, K. W., Farber, I. E., & McFann,
H. H. The relation of anxiety (drive)
level to performance in competitional and
noncompetitional paired-associates learn-
ing. J. exp. Psychol., 1956, 52, 296-305.

6. Taylor, J. A. The relationship of anxiety
to the conditioned eyelid response. J.
exp. Psychol., 1951, 41, 81-92.

7. Taylor, J. A., & Spence, K. W. The rela-
tionship of anxiety level to performance
in serial learning. J. exp. Psychol., 1952,
44, 61-64.


(Received for early publication June 25, 1956) 





Journal of Experimental Psychology 
Vol. 52, No. 5, 1956 


LEARNING AND EXTINCTION BASED UPON FRUSTRATION, 
FOOD REWARD, AND EXPLORATORY TENDENCY 


HARVEY M. ADELMAN AND JACK L. MAATSCH


Michigan State University ¹


In a recent study (1) designed to 
test some implications of an inter- 
ference theory of inhibition (6), the 
authors demonstrated that the type 
of response elicited by frustration 
stimulation resulting from the omis- 
sion of reward is a significant factor 
in determining resistance to extinction 
of the original learning. The present 
study is designed to investigate the 
learning and extinction of a habit 
based upon frustration stimulation. 

The primary methodological prob- 
lem involved in the demonstration of 
learning based upon the frustrating 
omission of reward is the necessity of 
isolating the new response to be 
learned from the attenuating influence
of the original learning required to
produce frustration initially. How-
ever, if this isolation could be ac- 
complished in a simple situation, it 
would be possible to subject learning 
based upon frustration to experi- 
mental analysis. 

For example, it would be possible 
to compare rates of learning and 
extinction of a habit based upon 
frustration, with the same habit
based upon more conventional rein- 
forcers, such as food reward and 
shock. It could also make possible 
a quantification of the relationship 
between the magnitude of frustration 
stimulation generated and some learn- 
ing variables, e.g., sHr, sEr, D, K,
and others. Knowledge of these re- 
lationships would constitute a step in 
the direction of clarifying the inter- 
action of the original and the new 


¹ Both authors are now at RAND Cor-
poration, Santa Monica, California.




learning produced by experimental 
extinction procedures. 

Such a situation was created in the 
first study of this series (1). The 
original habit involved an “approach 
to food” response in a straight alley 
maze. During the regular extinction 
procedure, i.e., removal of reward, one 
group was required to “jump-out” 
of the goal box after experiencing the 
frustrating omission of reward. The
specification of this jump-out response 
to frustration stimulation resulted in 
little or no extinction of the original 
approach response. In other words, 
during “extinction” of the original 
response, Ss continued to run down 
the alley at maximal speeds and then 
jump out of the goal box.

Since the “jump-out” response to 
the goal box represents a relatively 
discrete learning problem, it is pos- 
sible to study the learning and ex- 
tinction of this response under several 
different reinforcing conditions. 
Thus, the present study will compare 
learning and extinction of the jump- 
out response when based upon frus- 
tration, food reward, and innate 
exploratory and/or escape from con- 
finement tendency produced by the 
learning situation. 


METHOD


Apparatus.—The apparatus was a conven-
tional straight-alley maze. The starting-box
section was 11 in. square, painted flat gray, and 
covered with a piece of }-in. clear plate glass.
The 18 in. long, 54 in. wide, and 8&4 in. high 
runway was of natural plywood color and cov- 
ered by }-in. hardware cloth. The goal-box 
section was 11 in. square and 10 in. high,
painted black, and covered with a piece of }-
in. clear plate glass. A natural plywood guil-
lotine door separated the goal box from the
straight alley and a semicircular piece of black 
Bristol board (11-in. radius) was mounted on top
of the goal box on the side facing the alley to
prevent Ss viewing the rest of the maze when
perched on top of the goal box. A 2-in. black 
ledge was attached to the top external part of 
the goal box on the three remaining sides to 
facilitate perching after S had jumped from the 
goal box. 

Subjects.—The Ss were 30 experimentally
naive female hooded rats, 90-150 days old, from 
the colony maintained by the Psychology Animal 
Laboratory at Michigan State College. 

Procedure.—The Ss were handled for seven 
days prior to introduction into the maze. 
During this time they were put on a 23-hr. food 
deprivation schedule and received an average 
of 9 gm. of Purina Dog Chow checkers daily at 
the scheduled training time. Throughout the 
course of the experiment, all Ss were individually 
fed 9 gm. 10 min. after the end of the daily run. 
On Day 8, the 10 Ss in a frustration group were 
introduced into the maze and allowed free ex- 
ploration for a 1-hr. period. On Day 9, acqui- 
sition trials began. The Ss were given three 
spaced trials (10-min. intertrial interval) on
the first day, four spaced trials on the second day,
and six spaced trials per day for five days there-
after. A 20-sec. period after entering the goal
box was allowed for eating a }-gm. reward pellet.
After the 20-sec. interval Ss were removed to
individual running cages to await the next trial. 
Thus, each S in this group received 37 spaced 
learning trials in the straight alley. 
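The total of 37 trials can be checked against the daily schedule just described. The following is a minimal arithmetic sketch; the variable names are ours and carry no further meaning.

```python
# Straight-alley acquisition schedule for the frustration group, as stated
# in the Procedure: 3 trials on the first day, 4 on the second day, and
# 6 trials per day for the five days thereafter.
trials_per_day = [3, 4] + [6] * 5

total_trials = sum(trials_per_day)
training_days = len(trials_per_day)

print(f"{total_trials} trials over {training_days} days")
```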

On the following day extinction procedures 
began. The plate-glass cover was removed to 
allow access to the ledge on top of the goal box, 
and the reward pellet was removed from the 
goal box. After S had entered the goal box, it 
was allowed a 5-min. period in which to experi- 
ence nonreward and to escape by jumping to the 
ledge. After jumping, S remained on the ledge 
for 20 sec. before being returned to the running 
cages. If S did not jump within the 5-min.
period, he was aided in climbing to the ledge
by E inserting a hand into the box to serve as
a step. 

The Ss in a food-reward group were not run 
in the straight-alley portion of the maze. The 
10 Ss were confined in the goal box for a 20-sec. 
period for each of the 37 learning trials received 
by the frustration group but were given no food 
reward. These trials were spaced exactly as
were the trials for the frustration group in order 
to control for experience with the goal box. 
When the frustration group began the extinction 
procedure of jumping from the goal box the 
food-reward group was taught the identical
response, by rewarding them with a }-gm. pellet
placed on the ledge. Each S was allowed 20 sec. 
on the ledge to consume the pellet before being 
returned to the running cages. 

To preclude the possibility that the obtained 
results could have been due to exploratory 
tendencies, a control group of 10 Ss was utilized. 
This group underwent treatment identical to the
food group but were not rewarded for the jump- 
out response. In such a manner it was possible 
to determine whether the jump-out response 
could be learned on the basis of novel stimulation 
as well as through frustration and food reward. 

The acquisition period for the learning of the 
jump-out response covered three days of 10 
spaced (10-min. intertrial interval) trials per 
day. The frustration group followed the ex- 
tinction procedure of traversing the straight 
alley, experiencing nonreward, and jumping to 
the ledge, while the food-reward and control 
groups were put directly into the goal box and 
allowed to jump to the ledge. No food was 
present in the goal box for any of the three 
groups. The frustration group learned the 
jump response to the frustrating omission of 
food, the food-reward group learned to jump to 
food, and the control group jumped to escape 
confinement and/or explore novel stimuli. 

On the day following the 30-trial “jump-out” 
acquisition period, all groups underwent the 
jump-extinction procedure. The procedure for 
all groups was to place each S in the goal box 
and await the jump. When S had jumped and 
perched on the ledge for a period of 20 sec., he 
was picked up and put back into the box. None 
of the groups was rewarded for jumping. Each 
S was handled in such a manner until he re- 
mained in the box for a 5-min. no-jump period, 
or had gone through 100 jump trials.* The Ss
were treated singly and continuously until one 
of the two criteria had been met. 
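The two stopping criteria of the jump-extinction procedure can be expressed as a short decision rule. The sketch below is illustrative only: the function name, the representation of a no-jump trial as a latency of 300 sec. or more, and the example latencies are assumptions introduced here, not details of the original procedure.

```python
NO_JUMP_LIMIT_SEC = 5 * 60   # 5-min. no-jump period (extinction criterion)
MAX_JUMP_TRIALS = 100        # 100 jump trials (fixation criterion)


def run_extinction(latencies_sec):
    """Apply the two stopping criteria to a sequence of jump latencies.

    A latency at or above NO_JUMP_LIMIT_SEC is treated as a failure to
    jump within the 5-min. period (an assumed encoding).
    Returns (number_of_jumps, outcome).
    """
    jumps = 0
    for latency in latencies_sec:
        if latency >= NO_JUMP_LIMIT_SEC:
            return jumps, "extinguished"
        jumps += 1
        if jumps >= MAX_JUMP_TRIALS:
            return jumps, "fixated"
    return jumps, "incomplete"


# Hypothetical latency records for two Ss.
print(run_extinction([5.0] * 100))          # never exceeds the limit
print(run_extinction([5.0, 40.0, 300.0]))   # stops on the third trial
```

An S that jumps promptly on every trial reaches the 100-trial criterion, whereas an S whose latency reaches the 5-min. limit is classed as extinguished at that point.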


RESULTS 


The learning and extinction of the 
jump-out response under the three 
different conditions is presented in 
Fig. 1. The learning of this response
when elicited by frustration resulting 
from omission of reward proceeds at 


* It was generally agreed, prior to the experi- 
ment, that 100 massed extinction responses 
showing no sign of extinction, i.e., progressive 
increases in latency of response, would con- 
stitute a confirmation of a prediction of “no 
extinction” or “fixation” of the response 
tendency. 























[Figure 1. Panels: acquisition (ACQ) and extinction. Curves: control, food reward, frustration. Abscissa: blocks of 5 acquisition trials and 20 extinction trials.]

Fig. 1. A comparison of rates of learning and
extinction of a “jump-out” habit based upon
frustration stimulation, food reward, and ex-
ploratory tendency.


a relatively rapid rate—more rapidly, 
in fact, than learning based upon 
food reward for hungry Ss. Learning
based upon exploratory tendency of 
unrewarded hungry Ss is highly 
variable and unstable. The means 
of the median latencies for blocks of 
5 trials for the frustration, food, and 
control group were 4.9, 19.6, and 168.4 
sec., respectively. Application of 
Festinger’s nonparametric d test (2) 
to the data pooled in this manner 
yielded ds between all groups that 
were significant beyond the .01 level. 
Evidence to support the assertion 
that the jump-out response was 
elicited by the frustrating omission 
of reward may be found by analysis 
of the first jump trial. Eight Ss in 
the control group and 7 Ss in the food 
group but only 3 Ss in the frustration 
group failed to jump spontaneously 
within the 5-min. period. The differ- 
ence between the frustration group 
and the other groups combined is 
significant beyond the .01 level (t
= 2.57, df = 28). It would seem
that frustration stimulation tended 
to elicit the jump-out response more 
frequently than did the exploratory 
and/or escape-from-confinement stim- 
ulation in the other two groups. 
The jump-out extinction data are 
also presented in Fig. 1. The ex-
tinction data for the three groups are
clearly different. The frustration
group shows little or no extinction of 
the jump-out response after 100 
massed evocations. The Ss in the 
control group all reached the extinc- 
tion criterion before or within the 
first 20 extinction trials. The food- 
reward group, on the other hand, 
exhibits a conventional extinction 
curve. The means of the median 
latencies for the 100 extinction trials 
for the three groups were 4.8, 271.0, 
and 187.5 sec., respectively. The 
ds for all group comparisons were 
significant beyond the .01 level (2). 

Further evidence for the differences 
in the behavior of the three groups 
during extinction is found in the 
analysis of the data in terms of the 
fixation criterion. Seven of the 10 
Ss in the frustration group met the 
fixation criterion whereas none of the 
Ss in the other groups met this 
criterion. In addition to meeting the 
fixation criterion, the 7 Ss of the 
frustration group manifested other 
symptoms commonly associated with 
fixated behavior, that is, their be- 
havior was unemotional and highly 
stereotyped, almost to the point of 
being “mechanical.”


Discussion 


In general, the results tend to confirm 
the position that habits based upon 
frustration stimulation are learned more 
rapidly than a comparable habit based 
upon either food reward for hungry Ss or
exploratory tendency (control group).
Moreover, Ss trained to jump under
frustration stimulation tend to exhibit
a significantly greater resistance to
extinction than Ss trained to jump to
food or novel stimuli.

These results were predicted in ad-
vance from several considerations of the
interference theory. First of all, the
theory contains no concept of reactive
inhibition (I_R). This variable has been
shown to be an irrelevant factor in the
extinction process within the limits of
work usually required in a conventional 
learning study (5, 6). According to the
present theory, within the limits of 
physiological exhaustion, a habit will not 
be extinguished unless the response 
terminates in a situation providing a 
source for new learning which can inter- 
fere with the original habit. To assert 
such a basis for new learning requires the 
demonstration that some stimulus or 
stimulus complex in that situation is 
eliciting a member of some characteristic 
(identifiable) response class. This re- 
sponse class in turn must be incom- 
patible with the original habit before 
“extinction” will occur. 

Applying this approach to the extinc- 
tion data, we find that the extinction 
procedure of simply placing Ss back into 
the goal box did not interfere with the 
“jump-out” habit based upon frustra- 
tion. As a consequence, this group 
showed little or no extinction with 
massed evocation of the habit, i.e., 
fixation of the jump occurred. In the 
food-reward group, however, the ex- 
tinction procedure resulted in omission 
of food where previously expected. The 
frustration stimulation resulting from 
the omission of food reward elicited 
emotional and avoidance responses which 
reinforced a habit involving responses to 
frustration which in turn interfered with 
the original learning. As expected, the 
jump-out response for this group under- 
went typical extinction. The interfering 
learning which produced extinction in 
the food-reward group is analogous to 
the original learning of the jump-out 
response by the frustration group. The 
extinction of the jump habit in the 
control group may be accounted for by 
the loss of the capacity of novel stimuli 
to elicit the exploratory jump-out re- 
sponse because of stimulus satiation or 
familiarity (3, 8). 

Since both the frustration and food- 
reward groups received massed evocation 
(20-sec. intertrial interval) of the jump 
response during the extinction procedure 
and yet discretely different results were 
obtained, the results stand in contra-
diction to “explanations” derived from
Hull’s interpretation of extinction phe- 
nomena (4). It is difficult to understand 
how I_R could be responsible for pro-
ducing the extinction effects usually
attributed to it and yet fail to produce 
extinction of the habit based upon 
frustration even after 100 massed 10-in. 
jumps. The amount of work involved 
would seem to be greater than the 
amounts of work involved in most infra- 
human learning experiments. 

It is also interesting to note that the 
relatively faster learning produced by 
frustration is consistent with the general 
observation that extinction of a habit 
based upon 100% food reward requires 
fewer trials than are required by the 
original learning. Thus one of the 
many possible factors that produce 
relatively rapid rates of extinction may 
be simply: Interfering habits elicited by 
frustration stimulation are learned faster 
than an original habit based upon food 
reward. For a discussion of the im- 
portance of the disparity between the 
rate of learning and extinction for 
theories of inhibition see (9). 


SUMMARY 


The present study was designed to compare 
the learning and extinction of a habit based 
upon frustration stimulation, food reward, and 
exploratory tendency in the same situation. 
Three groups of Ss were trained to jump 10 in. 
to the top of a goal box under three types of 
reinforcing situations and then extinguished. 
The results demonstrate that the “jump-out” 
habit based upon frustration stimulation was 
learned faster and proved to be far more re- 
sistant to extinction than the same habit based 
upon food reward or exploratory tendency. 
The results were predicted by an interference 
theory of inhibition and seem to be inconsistent 
with explanations stemming from Hull’s treat- 
ment of extinction phenomena. 


REFERENCES 


1. Adelman, H. M., & Maatsch, J. L. Resistance
to extinction as a function of the type of
response elicited by frustration. J. exp.
Psychol., 1955, 50, 61-65.

2. Festinger, L. The significance of difference
between means without reference to the
frequency distribution function. Psycho-
metrika, 1946, 2, 97-105.

3. Glanzer, M. Stimulus satiation: an ex-
planation of spontaneous alternation and
related phenomena. Psychol. Rev., 1953,
60, 257-269.

4. Hull, C. L. A behavior system. New
Haven: Yale Univer. Press, 1952.

5. Maatsch, J. L., Adelman, H. M., & Denny,
M. R. Effort and resistance to extinc-
tion of the bar-pressing response. J.
comp. physiol. Psychol., 1954, 47, 47-50.

6. Maatsch, J. L. Reinforcement and extinc-
tion phenomena. Psychol. Rev., 1954,
61, 111-118.

7. Maatsch, J. L. An experimental test of the
differential effects of work and frustration
upon learning. Unpublished doctor's
dissertation, Michigan State College, 1955.

8. Montgomery, K. C. “Spontaneous alter-
nation” as a function of the time between
trials and amount of work. J. exp. Psy-
chol., 1954, 42, 82-93.

9. Spence, K. W. Theoretical interpretations
of learning. In S. S. Stevens (Ed.),
Handbook of experimental psychology.
New York: Wiley, 1951. Pp. 690-729.


(Received October 19, 1955) 










CONCEPT IDENTIFICATION AS A FUNCTION OF TASK 
COMPLEXITY AND DISTRIBUTION OF PRACTICE 
FREDERICK G. BROWN ¹ AND E. JAMES ARCHER


University of Wisconsin 


This is the second in a series of 
studies on concept identification. In 
the first, Archer, Bourne, and Brown
(1) varied complexity by presenting 
increasing amounts of irrelevant in- 
formation along with the relevant 
information in each stimulus pattern. 
Defining their units in terms of 
information theory (9), they were 
able to vary complexity quantita- 
tively along a single dimension. They 
predicted, on the basis of the number 
of possible hypotheses that S might 
formulate, that time to solution would 
be a positively accelerated exponential 
function of the amount of irrelevant 
information. The experimental re- 
sults supported this prediction. 
Thus, varying complexity by altering 
only one aspect of the stimulus 
situation, the amount of irrelevant 
information, they were able to vary 
task difficulty. 

Two early studies of problem 
solving (2, 3) found massed practice 
superior to distributed, especially 
during the earlier stages of the experi- 
ment. Recently two concept for- 
mation studies (10, 11) considered 
the effect of short intertrial rest 
intervals (60 sec. or less) on the
learning of concepts. Neither found 
any statistically significant effects 
produced by distribution of practice. 

There have been no studies directly 
concerned with the interaction of task 
complexity and distribution of prac- 
tice. Garrett (5) has suggested that 
distributed practice is advantageous 


¹ This research was done while the first author,
now at the University of Minnesota, was a 
National Science Foundation Fellow in Psy- 
chology. 


with easier tasks and that massed 
practice is better with harder tasks. 
Hull’s theory (8), though not dealing 
directly with concept formation, 
would probably predict the opposite 
results: distributed practice would 
lead to facilitation in learning with 
the more difficult (effortful) tasks. 
The purposes of this study were to 
(a) further explore the effect of vari- 
ations in task complexity, (b) study 
the effect of distribution of practice, 
and (c) test for the interaction of these 
two independent variables. Since 
performance in concept formation 
has varied as a function of the con- 
cepts to be learned (6, 7), the prob- 
lems in the present study were made 
orthogonal to distribution of practice 
and task complexity so that their 
differences could be isolated. 


PROCEDURE 


Subjects.—The Ss were 120 University of
Wisconsin students. Each was assigned ran- 
domly to one of 12 treatment combinations and 
served individually for one session of 30-45 min. 

Task.—The task for S has been described in
detail previously (1). Essentially S was pre- 
sented with geometric patterns and pressed one 
of four buttons to identify the correct category 
(concept). The four buttons corresponded to 
the four possible combinations of the two levels 
of the two relevant dimensions. A self-paced
correction procedure was used and accuracy, as 
opposed to speed, was stressed. 

A trial was defined as the presentation of 16 
patterns. Although each of the four buttons 
was correct four times during a trial, the se- 
quence of correct responses was random. A
blank frame on the strip film indicated the end 
of a trial. At the end of a trial S had one of 
three rest intervals. For the massed conditions 
the rest interval was approximately 3 sec., the 
time it took E to reset the timing apparatus and 
advance the strip-film projector to the first
pattern of the new trial. The distributed groups
had either a 30- or 60-sec. intertrial rest interval.
During this interval S named the suit and 
denomination of playing cards in an ordinary 
bridge deck. These stimuli were randomly 
presented on the same screen as the experimental 
patterns by a second strip-film projector. Six- 
teen trials of 16 patterns each, a total of 256 
patterns, constituted the experimental session.²

Apparatus.—For S the apparatus consisted 
of two parts: a milk-glass screen and a control 
panel. The glass screen (74 in. X 114 in.) was
mounted at about eye level in an opaque screen 
(33 in. X 34 in.) which prevented S from seeing 
E and the rest of the apparatus except the control 
panel. The stimuli were presented by a Dun- 
ning Animatic 16-mm. strip-film projector. 
When projected on the screen the large patterns 
were 1} in. on a side and the small ones } in. on a
side. The S was seated about 2 ft. from the 
screen. 

Each of the four buttons on the control panel 
was connected to a light on a panel in front of E.
These lights signaled to E which button S had 
pressed. If S pressed an incorrect button, E did
nothing except make a record of the response; 
if S pressed the correct button, E activated the 
frame-by-frame advancing device on the strip- 
film projector and a new pattern was presented. 
At the end of a trial E reversed the setting on a 
two-way switch which turned off one projector 
and turned on a second one. The second pro- 
jected the pictures of playing cards onto the 
screen at an automatically paced rate, 2 sec. per 
card. At the end of the rest interval, E reset 
the switch and S returned to identifying the 
geometric patterns. 

A Standard Electric time clock (.1-sec. scale) 
was used to record the time S took to complete 
each trial. The E recorded all responses and
the time per trial.

Lists.—A total of 20 lists was used—five at 
each level of irrelevant information. However, 
there were only five basic problems: a problem 
was defined as the four combinations of two 

² In a pilot study a criterion of two con-
secutively perfect trials was used. The number
of Ss who were unable to attain this criterion was 
so great as to make the design impractical. 
Since for the same number of trials the dis- 
tributed practice sessions were longer, more Ss 
in these groups declined to continue in the ex- 
periment until they attained the criterion. This 
biased selection of data made it appear as if 
distributed practice was facilitating since only 
the faster Ss could attain the criterion before 
they became tired of serving in the experiment. 
Many Ss served for more than an hour and were 
still operating only slightly better than chance. 




particular bileveled dimensions. By definition 
these dimensions were relevant as contrasted 
with all other information presented to S. The 
problems were obtained by random selection 
from eight possible dimensions: form (square 
or triangle), size (large or small), color (red or 
green), shade (light or dark), number (one or 
two), horizontal (left or right), vertical (top or 
bottom), and orientation (upright or tilted).
The five problems selected were: (A) number- 
horizontal, (B) shade-vertical, (C) size-number, 
(D) color-orientation, and (E) form-color. 
These five combinations of relevant dimensions 
(problems) were orthogonal to the four levels of 
irrelevant information.

Since each stimulus could appear at one of 
two levels of a dimension, the amount of in- 
formation contained in the stimulus could be 
quantified in bits of information by finding log2
of the number of equally probable alternative 
stimuli in the series being presented. If in 
Problem (A) no information were irrelevant, 
all of the patterns presented might be large, 
upright, dark red squares appearing in the lower 
half of the screen with but four variations; there 
would be one or two of them and they would 
appear on the left or right side of the screen.
Irrelevant information could be added by intro- 
ducing variations in any or all of the other 
dimensions. Using the above example, 1 bit of 
irrelevant information could be added by making 
half of the squares large and half of them small.
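The quantification described above can be sketched numerically. The example assumes, as the text states, that every dimension has two equally probable levels, so that the information in bits is the base-2 logarithm of the number of alternative patterns. The function names are ours, and the pattern counts are computed for illustration rather than quoted from the article.

```python
import math


def bits(n_alternatives):
    """Information in bits for n equally probable alternatives."""
    return math.log2(n_alternatives)


def n_patterns(n_relevant, n_irrelevant):
    """Number of distinct patterns when each varying dimension has two levels."""
    return 2 ** (n_relevant + n_irrelevant)


N_RELEVANT = 2  # each problem is defined by two relevant bilevel dimensions

for n_irrelevant in (0, 2, 4, 6):
    patterns = n_patterns(N_RELEVANT, n_irrelevant)
    print(f"{n_irrelevant} irrelevant bits: {patterns} patterns, "
          f"{bits(patterns):.0f} bits in all")
```

With no irrelevant information there are four alternative patterns (2 bits, all relevant); each added bilevel dimension doubles the number of patterns and adds 1 bit.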

Irrelevant information was added by ran- 
domly selecting the necessary number of di- 
mensions from those which were not relevant. 
At each level all possible patterns were used. 
If a dimension was neither relevant nor irrele- 
vant, it appeared at only one level which was 
also randomly selected. 

Design.—There were 12 major groups cor- 
responding to the combinations of distribution 
of practice (0-, 30-, or 60-sec. intertrial rest)
and task complexity (0, 2, 4, or 6 bits of irrele-
vant information) conditions. Ten Ss served
in each group, a total of 120 in the experiment. 
Within each group two Ss learned each of the 
five problems. The dependent variables were 
correct responses, errors, and time per trial. 
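The layout of the design can be enumerated to confirm the group and subject counts. This is a sketch under stated assumptions: the tuple representation and variable names are ours, the rest conditions are treated simply as labels, and the value of 60 sec. for the longest rest is our reading of a value that is poorly legible in the scanned text.

```python
from itertools import product

REST_SEC = (0, 30, 60)             # intertrial rest conditions (60 is assumed)
IRRELEVANT_BITS = (0, 2, 4, 6)     # task complexity conditions
PROBLEMS = ("A", "B", "C", "D", "E")
SS_PER_PROBLEM = 2                 # two Ss learned each problem within a group

# The 12 major groups: every combination of rest and complexity.
groups = list(product(REST_SEC, IRRELEVANT_BITS))

# One entry per S: (rest, irrelevant bits, problem, replicate index).
cells = [
    (rest, irrelevant, problem, replicate)
    for (rest, irrelevant) in groups
    for problem in PROBLEMS
    for replicate in range(SS_PER_PROBLEM)
]

ss_per_group = len(PROBLEMS) * SS_PER_PROBLEM
print(len(groups), "groups,", ss_per_group, "Ss per group,", len(cells), "Ss in all")
```

Because every problem appears equally often in every group, problems are orthogonal to both distribution of practice and task complexity, as the design requires.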


RESULTS


Comparability of groups.—The design 
prevented any statistical test of 
the initial equality of the 12 experimental 
groups. Initial equality was 
assumed only because Ss were randomly 
assigned to the experimental 
conditions. However, as no direct 
test was made, the significance of all 
conclusions depends on the validity 
of the assumption of initial equality 
of the groups. 

Correct responses.—A correct response 
was recorded when S’s first 
response to any stimulus was correct. 
Since a correction procedure and a 
specific number of patterns were used, 
it was meaningless to count a correct 
response every time S pressed the 
correct button. The trial-by-trial 
performance under the four irrelevant 
information conditions is plotted in 
Fig. 1. The data are summed across 
the conditions of distribution of 
practice (Rests) and Problems since 
neither of these effects was significant. 

A repeated measures analysis of 
variance was performed using the 
number of correct responses per trial 
as scores. This analysis is summarized 
in Table 1. Due to significant 
heterogeneity of variance (chi 
square = 70.19, df = 11, P < .01) 
the .01 level of significance was used. 
Two main effects were significant: 





Fig. 1. Mean correct responses as a function of 
amount of irrelevant information. (Ordinate: mean 
correct; abscissa: trials.) 


FREDERICK G. BROWN AND E. JAMES ARCHER 


TABLE 1 


Analysis of Variance of Correct 
Responses per Trial 

Source                                  df         MS            F 

Irrel. Information                      (3)    (4805.307)    (52.16**) 
  linear                                 1     13932.211     151.24** 
  quadratic                              1       469.063       5.09 
  cubic                                  1        14.648 
Rests                                    2        66.569 
Problems                                 4       203.752       2.21 
Ir. Inf. X Rests                         6       117.870       1.28 
Ir. Inf. X Prob.                        12       132.854       1.44 
Rests X Problems                         8       110.917       1.20 
Ir. Inf. X Prob. X Rests                24        76.659 
Residual Between Ss                     60        92.123 

Trials                                  15       353.775      68.22** 
Trials X Ir. Inf.                       45        16.368       3.16** 
Trials X Rests                          30         9.094       1.75* 
Trials X Prob.                          60         4.183 
Trials X Ir. Inf. X Rests               90         6.233       1.20 
Trials X Ir. Inf. X Problems           180         5.385       1.04 
Trials X Prob. X Rests                 120         7.496 
Trials X Rests X Prob. X Ir. Inf.      360 
Residual Within Ss                     900 
Total                                 1919 

* P = .01 level. 
** P = .001 level. 


Irrelevant Information (F = 52.16, 
df = 3 and 60, P < .001) and Trials 
(F = 68.22, df = 15 and 900, 
P < .001). The length of the intertrial 
rest interval (Rests) was not 
significant. The significant Irrelevant 
Information term indicated that 
the number of correct responses was 
an inverse function of task complexity, 
i.e., more correct responses 
were made to the problems having 
less irrelevant information. The 
means were 217.80, 165.93, 123.63, 
and 103.40 correct responses for 0, 2, 
4, and 6 bits of irrelevant information, 
respectively. The maximum number 
of correct responses possible was 256; 
since the chance probability of being 
correct was .25, the chance level was 





CONCEPT IDENTIFICATION 


64 correct. An orthogonal polynomial 
analysis (4) showed only the 
linear component to be significant 
(F = 151.24, df = 1 and 60, 
P < .001). This means that the 
function relating correct responses to 
irrelevant information can be best 
fitted by a straight line. The significant 
F ratio associated with Trials 
indicated that learning occurred with 
practice. 
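
The orthogonal polynomial analysis can be retraced from the reported means. The sketch below is not the authors' computation; it assumes 30 Ss per level of irrelevant information (120 Ss over four levels), the standard contrast coefficients for four equally spaced levels, and that the between-Ss sums of squares are expressed per trial score, i.e., divided by the 16 trials. The function name is hypothetical.

```python
# Mean total correct responses at 0, 2, 4, and 6 bits (from the text).
means = [217.80, 165.93, 123.63, 103.40]
n_per_level = 30      # assumption: 120 Ss / 4 levels
n_trials = 16         # assumption: SS scaled to the per-trial score
ms_error = 92.123     # Residual Between Ss, Table 1

# Standard orthogonal polynomial coefficients for four equally spaced levels.
coefficients = {
    "linear": [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic": [-1, 3, -3, 1],
}


def contrast_ms(coefs, level_means, n, trials):
    """Sum of squares (equal to the mean square, 1 df) for one contrast."""
    contrast = sum(c * m for c, m in zip(coefs, level_means))
    return n * contrast ** 2 / (sum(c * c for c in coefs) * trials)


for name, coefs in coefficients.items():
    ms = contrast_ms(coefs, means, n_per_level, n_trials)
    # F is the contrast mean square over the between-Ss error mean square.
    print(f"{name:9s} MS = {ms:10.3f}  F = {ms / ms_error:7.2f}")
```

Under these assumptions the linear and cubic mean squares come out very close to the tabled 13932.211 and 14.648, and the quadratic value differs from the tabled 469.063 only by an amount attributable to rounding of the means.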

The Trials X Irrelevant Information 
interaction was very significant 
(F = 3.16, df = 45 and 900, 
P < .001), indicating a difference 
between the slopes of the curves 
shown in Fig. 1. The Trials X Rests 
interaction was significant (F = 1.75, 
df = 30 and 900, P < .01), indicating 
that performance at various stages of 
practice was differentially affected by 
the distribution conditions. A triple 
interaction, Trials X Rests X Problems, 
was very significant but its 
meaning is unclear. 


Errors.—An error was defined as the pressing 
of an incorrect button. As a correction pro- 
cedure was used, it was possible for S to make 
more than one error per pattern. A Pearsonian 
product-moment correlation coefficient was com- 
puted between total correct responses and total 
errors. The obtained r = -.960 (df = 118, 
P < .01). In view of the high correlation the 
only analysis performed was a repeated measures 
analysis of variance using total errors per trial 
as scores. The major difference between this 
analysis and the previous one of correct responses 
was that the term of Problems was 
significant (F = 3.68, df = 4 and 60, P < .01). 
The meaning of this term will be discussed later. 
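
The strength of the reported correlation can be gauged with the usual t transformation of r. This is a sketch of a textbook formula, not a computation described by the authors; the function name is hypothetical.

```python
import math


def t_from_r(r: float, df: int) -> float:
    """t statistic for testing a product-moment r against zero (df = N - 2)."""
    return r * math.sqrt(df / (1.0 - r * r))


r, df = -0.960, 118
t = t_from_r(r, df)
shared_variance = r * r   # proportion of variance common to the two measures

print(round(t, 2), round(shared_variance, 3))
```

With roughly 92% of the variance shared, errors and correct responses carry nearly the same information, which is consistent with the decision to run only one further analysis.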

Time.—A third dependent variable was time 
per trial. However, as the instructions were 
designed to stress accuracy, not speed, and the 
experiment continued for a specific number of 
trials, time seemed to be a less meaningful 
measure than correct responses and errors. An 
analysis of variance was performed on total time, 
summing over trials. The only source of variation 
found significant was Irrelevant Information 
(F = 33.14, df = 3 and 60, P < .001). 
As might be predicted, the time scores were 
proportional to complexity, i.e., the means were 
8.69, 13.09, 17.92, and 19.44 min. for 0, 2, 4, and 
6 bits irrelevant, respectively. These time 
measures are the actual working time, of course, 
and do not include the interpolated rests. 
Correct solutions.—Another measure that 
could be obtained from the data was the number 
of Ss solving the problems. No statistical tests 
were performed on these data due to the small 
frequencies involved. Inspection of the data 
shows that the number of correct solutions, here 
defined as 30 or more correct out of the last 32 
patterns, decreases with increasing amounts of 
irrelevant information and tends to increase with 
increasing lengths of the intertrial rest interval. 


DISCUSSION 


The results of the present experiment 
lend further support to the hypothesis 
that task difficulty in a concept identi- 
fication situation can be quantitatively 
controlled by varying the amount of 
irrelevant information in the stimuli. 
Goodness of performance varied as an 
inverse linear function of the amount of 
irrelevant information. Since the range 
of task complexity was greater than in 
previous studies, the results extend the 
generality of this method of varying 
complexity. Although the use of or- 
thogonal polynomials indicated that the 
functions relating performance (correct 
responses, errors) to task complexity 
were not significantly nonlinear, these 
findings do not contradict those of the 
previous study (1). A graphic extrapolation 
of the learning curves suggests 
that trials to solution would be a positively 
accelerated function of the amount 
of irrelevant information. Since the 
present study did not require Ss to 
attain the high criterion of the previous 
study, it is not surprising that the per- 
formance-complexity functions are not 
the same. 

In addition to the sources already 
mentioned, the term associated with the 
particular pairs of relevant dimensions, 
Problems, was significant in the error 
analysis. Tukey's test for significant 
gaps (12) performed on the mean error 
scores showed three distinct groups in 
decreasing order of difficulty: Problem 
B; Problem A; and Problems C, D, and 
E. By analyzing the problems into 
relevant dimensions it was found that 
each of the more difficult problems (A, 
B) contained one “positional” dimension, 
i.e., one referring to the position of the 
pattern on the screen and one associated 
with the stimulus pattern per se. This 
finding suggests that there may be some- 
thing about the “positional” dimensions 
which makes them less available to S for 
solving the problems. Furthermore, 
shade seems to be more difficult than the 
other dimensions associated with the 
pattern proper. When it is combined 
with a positional dimension, the resulting 
problem (B) is significantly more difficult 
than the combination of a positional 
dimension and any other dimension 
associated with the stimulus pattern 
(Problem A). 

Distribution of practice as a main 
effect was consistently nonsignificant. 
However, the primary purpose of the 
present experiment was to test the effects 
of distribution of practice in a concept 
identification situation in which different 
levels of complexity were involved. In 
all analyses the interaction of distri- 
bution of practice and task complexity 
was not significant. Clearly, there was 
no differential effect of distribution of 
practice as a function of task complexity. 

Another way in which the effect of 
distribution of practice might have been 
manifested was in the Trials X Rests 
interaction. It has been proposed (13, 
p. 438f.) that for concept formation and 
problem solving massed practice may be 
facilitating early in learning whereas 
distributed practice may be beneficial 
later. The finding of the Trials X Rests 
interaction as a significant source of 
variation for both number of correct 
responses and number of errors lends 
support to this hypothesis. The effect 
of interpolated rests was not apparent 
early in learning. For the last few trials, 
however, better performance was as- 
sociated with the longest rest interval 
(60 sec.). The unusually poor per- 
formance of one of the groups tended to 
complicate this relationship, however. 
The 4-bit irrelevant: 30-sec. rest group 
was uniformly poorer than the other 11 
groups. The degrading effect of this 
group is apparent in Fig. 1 which shows 




the 4-bits curve nearer to the 6-bits curve 
than to the 2-bits curve, particularly in 
the later stages of practice. When the 
learning curves for the three degrees of 
distributed practice were compared, after 
elimination of the data for all of the 
4-bit groups, the interaction of stage 
and distribution of practice was clear. 
Distribution of practice leads to better 
performance late in learning (identifi- 
cation). The results further indicate 
that if a high criterion, e.g., 30 out of 32 
correct responses, had been required 
instead of a specific number of trials, the 
main effect of distribution of practice 
would probably have been found sig- 
nificant if the dependent response meas- 
ure had been the number of trials 
required to attain that criterion. 


SUMMARY 


The purposes of the present experiment were 
to study the effect of distribution of practice on 
a concept identification task, to explore further 
the effect of variations in task complexity, and 
to test for the interaction of these two inde- 
pendent variables. 

The task for S was to classify geometric 
patterns into four categories. Each of the 120 
Ss served individually, receiving 16 trials of 16 
patterns with the appropriate interpolated rest 
intervals. There were 12 experimental groups 
corresponding to the combinations of distri- 
bution of practice (0-, 30-, or 60-sec. intertrial 
rest) and task complexity (0, 2, 4, or 6 bits of 
irrelevant information) conditions. Ten Ss 
served in each group, two Ss learning each of 
the five problems. 

The major conclusions were: (a) Varying the 
amount of irrelevant information in the stimulus 
was an effective way of varying task difficulty. 
(b) There was some tendency for the problems to 
differ in difficulty. The problems containing 
a “positional” dimension were more difficult to 
identify. (c) Neither the main effect of distri- 
bution of practice nor the interaction with 
complexity was found significant. However, 
distribution of practice may have a facilitating 
effect if learning is continued until a difficult 
criterion is attained. 


REFERENCES 


1. Archer, E. J., Bourne, L. E., & Brown, 
F. G. Concept identification as a function 
of irrelevant information and instructions. 
J. exp. Psychol., 1955, 49, 153-164. 

2. Cook, T. W. Massed and distributed 
practice in puzzle solving. Psychol. Rev., 
1934, 41, 330-355. 

3. Ericksen, S. C. Variability in attack in 
massed and distributed practice. J. exp. 
Psychol., 1942, 31, 339-345. 

4. Fisher, R. A., & Yates, F. Statistical 
tables for biological, agricultural, and 
medical research. London: Oliver and 
Boyd, 1953. 

5. Garrett, H. E. Variability in learning 
under massed and spaced practice. J. 
exp. Psychol., 1940, 26, 547-567. 

6. Grant, D. A., & Curran, J. Relative 
difficulty of number, form, and color 
concepts of a Weigl-type problem using 
unsystematic number cards. J. exp. 
Psychol., 1952, 43, 408-413. 

7. Heidbreder, E. The attainment of concepts: 
II. The problem. J. gen. Psychol., 
1946, 35, 191-223. 

8. Hull, C. L. Principles of behavior. New 
York: D. Appleton-Century, 1943. 

9. Miller, G. A. What is information measurement? 
Amer. Psychol., 1953, 8, 3-11. 

10. Oseas, L., & Underwood, B. J. Studies of 
distributed practice: V. Learning and 
retention of concepts. J. exp. Psychol., 
1952, 43, 143-148. 

11. Richardson, J., & Bergum, B. O. Distributed 
practice and rote learning in concept 
formation. J. exp. Psychol., 1954, 
47, 442-446. 

12. Tukey, J. W. Comparing individual means 
in the analysis of variance. Biometrics, 
1949, 5, 99-114. 

13. Underwood, B. J. Experimental psychology. 
New York: Appleton-Century-Crofts, 1949. 

(Received October 17, 1955) 





Journal of Experimental Psychology 
Vol. 52, No. 5, 1956 


INVERTED-ALPHABET PRINTING AS A FUNCTION OF 
INTERTRIAL REST AND SEX 


E. JAMES ARCHER AND LYLE E. BOURNE, JR. 


University of Wisconsin 1 


The present study is concerned 
with the measurement and analysis 
of the time characteristics of the 
components of motor performance in 
inverted-alphabet printing. For so 
simple a motor skill two independent 
components of performance can be 
defined and measured: (a) the time 
required to print the individual let- 
ters, and (b) the time required to 
travel between successive letters. 

There are at least two response 
measures which are of theoretical 
interest for each of these components: 
(a) the mean responding time, and 
(b) the longest duration of continuous 
responding. The first pair might 
provide information on the nature of 
the facilitation which occurs with 
distributed practice in a motor skill. 
Previous studies (1, 7, 8, 9, 10) have 
shown that distributed practice is 
superior to massed practice for the 
inverted-alphabet printing task. The 
response measure, of course, was the 
number of letters printed per trial. 
However, no evidence has been ob- 
tained to indicate whether the better 
performance was due to faster printing 
or shorter interletter travel time or 
both. Similar questions might be 
asked about the phenomenon of 
reminiscence. 

Interest in the second pair of 
response measures centers about a 
test of a hypothetical phenomenon 
proposed by Kimble (8)—the “resting 
response.” It was stated that “The 


1 This research was supported by a grant from 
the Research Committee from the funds pro- 
vided by the Wisconsin Alumni Research 
Foundation. 


322 


accumulation of a certain (critical, 
threshold) amount of I_R will automatically 
produce resting” (8, p. 500). 
It is conceivable that S might emit a 
“resting response” while in the act 
of printing, i.e., would print very 
slowly, but it seems more probable 
that if he does rest, it will be during 
the traveling phase. In either case 
the “resting response” should appear 
as an unusually long duration of 
continuous printing or continuous 
traveling and with greater frequency 
in massed than distributed practice 
since more I_R is believed to develop 
in the former condition. 

Buxton and Grant (6) have demon- 
strated that men are superior to 
women on the rotary pursuit task. 
It is conceivable that there might be 
a similar sex difference in inverted- 
alphabet printing and that this dif- 
ference might be reflected in the mean 
printing and traveling times. 


PROCEDURE 


Subjects.—A total of 84 right-handed volun- 
teer college students (42 men and 42 women) 
were randomly assigned to provide equal Ns in 
six combinations of sex and distribution of 
practice (men or women; and 0-, 30-, or 60-sec. 
intertrial rest). The Ss were introductory 
psychology students and were naive with respect 
to the purpose of the task. In an attempt to 
eliminate prior practice, Ss were asked not to 
talk to their friends about what they did in the 
experiment. 

Learning task and conditions.—The learning 
task was inverted-alphabet printing. The Ss 
were instructed as to the nature of the task and 
were told that certain letters of the alphabet, 
i.e., H, I, N, O, S, X, and Z, looked the same 
whether printed upside down or right side up. 
These latter instructions were given to reduce 
variability in performance as much as possible. 







The Ss were instructed to print the alphabet in 
capital letters as quickly as possible from left to 
right starting in the upper left-hand corner, 
placing one letter in each of the .5-in. squares 
marked on the printing sheet. If they know- 
ingly made an error, they were to print right 
over the error and continue. 

The Ss serving in the 0-sec. rest condition 
were told to skip three squares and continue with 
the next letter of the alphabet when given the 
command “skip.” This signal was given every 
30 sec., i.e., at the end of each trial. Those Ss 
who served in a distributed practice condition 
were told to stop printing at the end of each 30- 
sec. trial and to begin the rest-interval activity. 
Five seconds before the end of the rest period, 
Ss were given the command “ready,” at which 
time they were to interrupt their rest-interval 
activity, pick up the pencil used for alphabet 
printing, check to see what letter they should 
print next, and on the command “begin,” 
resume printing the alphabet, leaving three 
spaces between the last letter of the previous 
trial and the first letter of the new trial. The 
skipping of three squares enabled E to identify 
each trial after S finished the experiment. 

All Ss received a total of 26 30-sec. trials. 
Two groups (one of men and one of women) of 
14 Ss each received the first 20 trials consecutively 
without a rest (massed practice). Two 
other groups, again one of each sex, had a 30-sec. 
rest between each of the first 20 trials. The 
two remaining groups of 14 Ss each had a 60-sec. 
rest between each of the first 20 trials. Fol- 
lowing Trial 20, all groups had a 5-min. rest 
after which all Ss had six additional 30-sec. 
trials with a 30-sec. rest between each. 
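
The session schedule just described can be laid out as simple arithmetic. The sketch assumes that rests fall only between successive trials (19 gaps among 20 prerest trials, 5 gaps among 6 postrest trials); the constant and function names are illustrative.

```python
TRIAL_SEC = 30
PREREST_TRIALS = 20
POSTREST_TRIALS = 6
LONG_REST_SEC = 5 * 60      # rest following Trial 20
POSTREST_GAP_SEC = 30       # rest between postrest trials for all groups


def session_seconds(prerest_gap_sec: int) -> dict:
    """Elapsed time per phase for one intertrial-rest condition."""
    prerest = (PREREST_TRIALS * TRIAL_SEC
               + (PREREST_TRIALS - 1) * prerest_gap_sec)
    postrest = (POSTREST_TRIALS * TRIAL_SEC
                + (POSTREST_TRIALS - 1) * POSTREST_GAP_SEC)
    return {
        "prerest": prerest,
        "postrest": postrest,
        "total": prerest + LONG_REST_SEC + postrest,
    }


for gap in (0, 30, 60):
    print(gap, session_seconds(gap))
```

For the massed condition the prerest phase amounts to 600 sec., i.e., 10 min. of continuous printing before the 5-min. rest.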

Rest-interval activity.—Unlike previous studies 
using this task (1, 7, 8, 9, 10), Ss in the present 
experiment served individually. Thus it was 
possible to provide better control over their 
rest-interval activity, which consisted of 
canceling certain letters of the alphabet from 
an apparently random series of letters. This 
activity was selected because of its face valid- 
ity as being equal in importance to the 
inverted-alphabet printing task. Since the 
letters to be canceled were typed in lower case 
and the letters to be printed upside down were 
in upper case, it seemed improbable that the 
rest-interval activity would transfer to the 
learning activity. 

Apparatus.—In order to record the afore- 
mentioned time characteristics, Ss printed the 
alphabet on a roughened aluminum sheet which 
had been set flush with a 21 X 23-in. inclined 
table. The aluminum sheet had been scored 
with 1,035 half-in. squares arranged in a matrix 
of 29 rows of 35 squares. On the basis of some 
prior testing it had been found that this arrangement 
of the matrix was most comfortable and 
accessible to Ss. A soft (No. 1) lead pencil was 
connected to a fine piece of insulated copper 
wire which was supported by a small gallows-like 
boom mounted at the center back of the inclined 
table. This support kept the connecting wire 
out of S’s way while he was printing. A 
different pencil was used for the rest-interval 
activity. The aluminum sheet and the wired 
pencil were part of a 6-v. AC circuit which closed 
a relay when S brought the pencil point in 
contact with the aluminum sheet. The signal 
from this relay was analyzed by a device which 
has previously been described and designated as 
a Tracking-time Analyzer (5). 
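
The logic of separating printing time from traveling time can be sketched in software. The original apparatus was an electrical relay circuit feeding the Tracking-time Analyzer; the sampled signal, the 0.1-sec. sampling step, and all names below are hypothetical and serve only to illustrate how the two response measures (mean duration and longest continuous duration) follow from a contact record.

```python
from itertools import groupby


def run_durations(contact, dt):
    """Split a sampled pencil-contact record into run durations (sec.).

    contact: sequence of 1 (pencil on sheet) or 0 (pencil lifted).
    dt: sampling step in seconds.
    """
    runs = {"printing": [], "traveling": []}
    for in_contact, samples in groupby(contact):
        key = "printing" if in_contact else "traveling"
        runs[key].append(sum(1 for _ in samples) * dt)
    return runs


def summarize(durations):
    """Mean duration and longest continuous duration for one component."""
    return {"mean": sum(durations) / len(durations), "longest": max(durations)}


# Hypothetical record sampled every 0.1 sec.
contact = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
runs = run_durations(contact, dt=0.1)
print(summarize(runs["printing"]))
print(summarize(runs["traveling"]))
```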


RESULTS 


For nearly all response measures 
men were significantly more variable 
than women. Because of this type 
of heterogeneity of variance it was not 
possible to apply any logical trans- 
formation. In order to use analyses 
of variance rather than a less powerful 
nonparametric test, the acceptable 
significance levels of the F ratios were 
raised by entering the table of the 
F distribution with one-half the 
number of degrees of freedom for 
the error term whenever it contained 
significantly heterogeneous com- 
ponents. 

Letters printed.—A simple analysis 
of variance indicated that the groups 
did not differ significantly (F = 1.37, 
P > .05, for 5 and 78 df) on the first 
trial. Although there seemed to be 
a slight sex difference in favor of 
women, the difference was not sig- 
nificant. 

The performance of each of the six 
groups is shown in Fig. 1. An 
analysis of variance of the prerest 
performance data is presented in 
Table 1. The term Ss/Groups is the 
proper error term for the three sources 
of variance shown above it. In spite 
of a significant heterogeneity of vari- 
ance (x? = 18.01 for 5 df, P < .01), 
it seems reasonable to conclude that 
there is a significant sex difference 












[Figure 1. Ordinate: mean number of inverted letters printed. Abscissa: trials, with the 5-min. rest marked. Key: men and women at 0-, 30-, and 60-sec. rest.] 





Fic. 1. Mean number of inverted letters 
printed for each 30-sec. trial and for each sex: 
distribution-of-practice combination. The 
treatment combinations are indicated by the 
key. The postrest condition for all Ss was 
distributed practice with a 30-sec. rest between 
each trial. 


(P < .05)—women can print more 
inverted letters per trial than men. 
As expected, the length of the rest 
interval was also found to be sig- 
nificant (P < .01). In addition to 
individual differences and practice, 
the slopes of the learning curves also 
differed significantly. This finding 
was indicated by the significant 
Trials X Rest interaction (P < .001). 


TABLE 1 


Analysis of Variance of Number of Letters 
Printed During the First 20 
Practice Trials 

Source                     df        MS          F 

Sex                         1     3,388.69      5.30* 
Rest Interval               2     5,100.73      8.07** 
Sex X Rest Interval         2       619.43 
Ss/Groups                  78       639.87     89.01*** 
Trials                     19       937.19    130.37*** 
Trials X Sex               19         6.23 
Trials X Rest              38        29.96      4.17*** 
Sex X Rest X Trials        38         9.46      1.32 
Residual                 1482         7.19 
Total                    1679 

* P < .05. 
** P < .01. 
*** P < .001. 




For the first postrest trial Sex was 
the only significant source of vari- 
ation. Women were superior to men 
(F = 5.61, P < .05, for 1 and 36 df). 
Different prerest intertrial intervals 
did not lead to significantly different 
performance on the first postrest trial 
(F = 1.14, P > .05). 

Analysis of variance was performed 
on the difference scores of Trial 21 
minus Trial 20. This test indicated 
that the recovery from work decre- 
ment was an inverse linear function 
of the length of the prerest intertrial 
interval. Using orthogonal poly- 
nomials indicated that only the linear 
component of this function was significant 
(F = 42.98, P < .01, for 1 
and 39 df). 

For all six of the postrest trials, 
only two sources of variation were 
significant—individual differences and 
the Trials X Rest interaction (F 
= 2.50, P < .01, for 10 and 200 df). 

Printing time.—An analysis of vari- 
ance of the mean printing time on the 
first trial indicated that none of the 
sources of variation was significant. 

The mean printing times for the 20 
prerest trials are shown in the top of 
Fig. 2. Since the printing times for 
the two sexes do not differ signifi- 
cantly, the data have been combined 
in Fig. 2. In addition to the sig- 
nificant reduction in printing time as 
a function of practice (F = 37.64, 
P < .001, for 19 and 741 df) the only 
other significant source of variation 
was that of different intertrial rest 
intervals (F = 3.22, P < .05, for 2 
and 39 df). Printing time is an 
inverse function of intertrial interval. 

An analysis of printing-time difference 
scores between Trials 20 and 21 
indicated only one significant source 
of variation—intertrial rest intervals 
(F = 9.14, P < .01, for 2 and 78 df). 
The greatest gain in postrest printing 
speed occurred in the prerest massed 
practice group. Apparently reminiscence 
can in part be attributed to a 
decrease in the time of the actual 
printing operation. 

For all six postrest trials, the only 
source of variation which was signifi- 
cant was that of practice (F = 5.69, 
P < .01, for 5 and 390 df). In 
addition to a slight decrease as a 
function of the 5-min. rest, the 
printing time shows a further de- 
crease with postrest practice. 

Traveling time.—Performance on 
the first trial was analyzed for initial 
differences between the six groups in 
terms of the mean time to travel from 
one letter to the next, i.e., the time 
the pencil was not in contact with the 
aluminum sheet. The only source of 
variance which was significant was 
Sex (F = 10.32, P < .01, for 1 and 
39 df); women show a shorter travel 
time than men. 

The mean travel times for each of 
the treatment combinations are shown 
in the lower half of Fig. 2. The 
results of an analysis of variance of 
the mean travel times are shown in 
Table 2. The two main effects of 
Sex and Rest Interval were highly 
significant and as usual the variances 
attributable to individual differences 
and to practice were also significant. 


TABLE 2 


Analysis of Variance of Mean Travel Times 
for the First 20 Practice Trials 

Source                     df       MS        F 

Sex                         1     33.96     19.34** 
Rest Interval               2     10.73      6.11** 
Sex X Rest Interval         2      3.26      1.86 
Ss/Groups                  78      1.76     32.09** 
Trials                     19      2.95     53.88** 
Trials X Sex               19      0.25      4.51* 
Trials X Rest              38      0.07      1.36 
Sex X Rest X Trials        38      0.04 
Residual                 1482      0.05 
Total                    1679 





*P <M. 
—-P < OM. 





[Figure 2. Upper panel: printing. Lower panel: traveling. Abscissa: trials.] 


Fic. 2. The upper half of this figure shows 
the mean printing time for each of the 20 prerest 
and 6 postrest trials. The data for the sexes 
have been combined since the difference be- 
tween them was not significant. The lower half 
of the figure shows the mean traveling time for 
each trial for each of the sex:distribution-of- 
practice combinations. The postrest condition 
for all Ss was distributed practice with a 30-sec. 
rest between each trial. 


An additional significant source of 
variation was the Trials X Sex interaction. 
As indicated by Fig. 2, men 
show a greater decrease in travel time 
with practice than do women. 

An analysis of travel-time difference 
scores between Trials 20 and 21 indicated 
that only the length of the 
prerest intertrial rest interval was 
significant (F = 14.40, P < .01, for 
2 and 39 df). Orthogonal polynomials 
indicated that both the linear 
and quadratic terms were significant. 

An analysis of travel time for all 
six postrest trials indicated that in 
addition to individual differences, 
Sex was the only other significant 
source of variation (F = 13.09, P 
< .01, for 1 and 39 df). Not only 
are women initially superior to men 
in terms of shorter travel time but, 
after 20 trials of practice and 5 min. 
of rest, they are still superior. 

Durations of continuous responding.—Although 
the Tracking-time Analyzer 
provided a frequency distribution 
of the durations of continuous traveling 
or printing, only the longest 
durations of these continuous responses 
for each trial were analyzed. 
This response measure seems to come 
closest to a direct test of Kimble’s 
hypothetical “resting response.” 
Since the Tracking-time Analyzer 
could be used only to measure either 
traveling or printing time and not 
both simultaneously, it was necessary 
to obtain alternate-trial records for 
each S. For half of the Ss in each 
treatment combination (7 Ss) the 
frequency distributions of durations 
of traveling time were obtained on 
odd-numbered trials and the corresponding 
measure for printing time 
was obtained on even-numbered trials. 
The opposite was true for the other 
half of the Ss. The E controlled the 
switchover from the recording of one 
response measure to another without 
S being aware of the change. 

The mean longest duration of continuous 
travel for the men was 1.67 
sec. as compared to 1.35 sec. for the 
women (F = 13.62, P < .01, for 1 
and 39 df). The mean longest durations 
of continuous travel, 1.70, 1.50, 
and 1.33 sec. for the 0-, 30-, and 60-sec. 
intervals, respectively, also differed 
significantly (F = 6.12, P 
< .01, for 2 and 39 df). The mean 
duration of longest continuous traveling 
decreased during prerest practice 
from 1.96 sec. to 1.34 sec. (F = 8.84, 
P < .001, for 9 and 351 df). The 
source of variance which would have 
supported Kimble’s hypothesis, 
namely, the interaction of Stage 
X Rest, was not significant (F 
= 1.13, P > .05, for 18 and 351 df). 
This interaction would have been 
significant (a) if the mean longest 
duration of continuous traveling for 
the massed-practice group did not 
decrease as rapidly with practice as 
those for the other two intertrial 
intervals or (b) if the mean increased 
at some later stage of practice while 
the means for the other two groups 
either decreased or showed no change. 
Neither of these results appeared. 

Difference scores between pre- and 
postrest performance were analyzed 
to determine if reminiscence could be 
partly explained by a decrease in 
mean longest duration of continuous 
travel. None of the sources of vari- 
ance was significant. 

Besides individual differences, Sex 
was the only other significant source 
of variation (F = 11.11, P < .001, 
for 1 and 39 df) for all of the postrest 
trials. Both sexes showed a con- 
siderable decrease from their prerest 
performance in mean longest duration 
of continuous travel, but the women 
remained superior—1.32 sec. for men 
and 1.02 sec. for women. 

Although it seemed improbable that 
S would “rest” with his pencil in 
contact with the printing surface, the 
mean durations of longest continuous 
printing times were analyzed. In 
addition to individual differences, 
practice was the only significant 
source of variation (F = 5.89, P 
< .001, for 9 and 702 df). The mean 
duration of longest continuous print- 
ing decreased from 1.09 sec. to .84 sec. 


DISCUSSION 


Contrary to what had been expected, 
women can print more inverted letters 
per trial than men, irrespective of the 
intertrial interval. Furthermore, women 
were superior not because they printed 
the letters faster but because they 
traveled between the letters faster. 
Women showed a shorter mean travel 
time on the first trial and continued this 
performance superiority throughout all 
stages of practice. Furthermore, the 
mean longest duration of continuous 
travel was significantly shorter for 
women than for men. Although the 
foregoing sex differences are of interest, 
it was the measures of responding times 
that were of theoretical importance. 

Kimble (9), using an inverted-alphabet 
printing task, proposed the hypothetical 
“resting response” to account for con- 
ditioned inhibition. To date there has 
been no direct evidence to support the 
existence of these “resting responses.” 
In the present study the duration of the 
longest continuous travel time (a score 
which was believed to be a direct measure 
of a “resting response”) was greater for
the massed-practice group. However, 
it did not increase even when the con- 
ditions of extreme massing were con- 
tinued for several minutes; instead it 
decreased with practice. The massed- 
practice Ss served for 10 min. before 
receiving a rest; during that period both 
the mean travel time and the longest 
continuous travel time decreased at 
about the same rate for all sex-distri- 
bution combinations. If the Ss in the 
massed-practice conditions had made any 
“resting responses,” the effect would 
have appeared as a significant Trial
× Rest interaction in Table 2 and a
significant Stage × Rest interaction in
the analysis of the continuous travel 
times. Only one conclusion seems indi-
cated by the present data: “resting
responses” did not occur.

Though it may be premature for a
complex theoretical elaboration, the re-
sults of the present and a previous study
(5) suggest two possible effects of I_R.
(a) When the responses required of the
S are discrete, i.e., can be counted like
inverted letters of the alphabet, the
performance of the response will be
delayed by I_R and will be related in-
versely to the length of intertrial rests.
(b) When the required responses are
continuous, e.g., rotary pursuit tracking,
the quality of the response will be
degraded by I_R because of interference
with the performance of the skill and
will be related directly to the length of
intertrial rests. The quality of rotary
pursuit tracking can be defined in terms
of smooth, circular movements.

Studies by Boldt and Ellis (4) on
block turning and Bilodeau and Bilodeau
(3) on cranking showed a slower rate of
responding when the effort, and pre-
sumably the I_R, was increased. An
earlier study by Bilodeau (2) also pro-
vides evidence for the first suggestion.
Although only one rest was interpolated,
performance was improved immediately
after the rest as measured by an increased
rate of responding.

There is no direct evidence currently 
available for the second suggestion aside 
from that provided by an earlier study 
(5). The best evidence would be a direct 
measurement of “erratic performance” 
following several minutes of continuous 
practice. Unlike the “resting response,”
it would not be necessary for this “er-
ratic performance” to suddenly appear.
Naive Ss on a task such as the pursuit 
rotor start out with erratic performance. 
If with massed practice the erratic per- 
formance does not disappear as rapidly 
as with training under distributed prac- 
tice, such evidence would be regarded as 
support for the second suggested effect 
of I_R.

The proposed effects of I_R do not
appear to be inconsistent with the 
phenomenon of reminiscence. If after 
massed practice of a discrete motor task, 
S is permitted to recover from the de-
laying effect of I_R, productivity will
increase primarily because of a decrease 
in travel time. On the other hand, if 
S had been practicing a continuous motor 
skill, the recovery is in terms of improved 
quality of performance (5), e.g., smoother 
tracking with longer durations of time 
continuously on target. 


SUMMARY 


The results of the present study may be 
summarized by the following points: 


1. For the response measure of number of 
inverted letters printed per trial, women were 
superior to men and distributed practice was 
better than massed practice. 

2. For the response measure of mean printing 
time per letter, faster printing times were as- 
sociated with longer intertrial rest intervals. 

3. Women print more letters per trial than
men not because they print faster but because 
they travel faster between letters. Shorter 
travel times were associated with longer inter- 
trial rest intervals. 













4. There was no evidence for “resting re- 
sponses” as measured by the longest duration of 
continuous traveling or printing. The mean 
longest duration of continuous traveling for 
women was less than for men and less for dis- 
tributed than massed practice. 

5. Most of the improvement in performance 
after a 5-min. rest could be attributed to a
decrease in traveling time. The gain in per- 
formance in terms of traveling time was an 
inverse nonlinear function of the length of the 
prerest intertrial interval. 

6. It is suggested that for discrete tasks I_R
delays the response, and for continuous tasks 
the quality of the response suffers because of 
interference with the effector system. 


REFERENCES 


1. Archer, E. J. Postrest performance in
motor learning as a function of prerest
degree of distribution of practice. J.
exp. Psychol., 1954, 47, 43-51.

2. Bilodeau, E. A. Massing and spacing
phenomena as functions of prolonged
and extended practice. J. exp. Psychol.,
1952, 44, 108-113.

3. Bilodeau, I. McD., & Bilodeau, E. A.
Some effects of work loading in a repeti-
tive motor task. J. exp. Psychol., 1954,
48, 455-467.

4. Boldt, R. F., & Ellis, D. S. Voluntary
rest pause behavior in a block-turning
task as a function of wrist-cuff weight.
J. exp. Psychol., 1954, 47, 84-88.

5. Bourne, L. E., Jr., & Archer, E. J. Time
continuously on target as a function of
distribution of practice. J. exp. Psychol.,
1956, 51, 25-33.

6. Buxton, C. E., & Grant, D. A. Retro-
action and gains in motor learning: II.
Sex differences, and a further analysis of
gains. J. exp. Psychol., 1939, 25, 198-

7. Kientzle, M. J. Properties of learning
curves under varied distributions of
practice. J. exp. Psychol., 1946, 36,
187-211.

8. Kimble, G. A. Performance and remi-
niscence in motor learning as a function
of the degree of distribution of practice.
J. exp. Psychol., 1949, 39, 500-510.

9. Schucker, R. E., Stevens, L. B., & Ellis,
D. S. A retest for conditioned inhibition
in the alphabet-printing task. J. exp.
Psychol., 1953, 46, 97-102.

10. Wasserman, H. N. The effect of moti-
vation and amount of pre-rest practice
upon inhibitory potential in motor learn-
ing. J. exp. Psychol., 1951, 42, 162-172.


(Received October 17, 1955) 








Journal of Experimental Psychology 
Vol. 52, No. 5, 1956


REDUCTION OF ERROR WITH PRACTICE IN PERCEPTION 
OF THE POSTURAL VERTICAL¹


CHARLES M. SOLLEY²


The Menninger Foundation 


Whenever Ss make repeated judg- 
ments, as in many perceptual tasks, 
an experimenter is faced with the 
problem of whether there are signifi- 
cant changes in accuracy of those 
perceptual judgments as a function of 
continued practice. There are a few 
studies (3, 4, 5, 6) which have demon-
strated that there are substantial
increases in accuracy of postural 
orientation in space through practice, 
under conditions where there are no 
visual cues for orientation. Though 
these studies were performed some 
thirty odd years ago, they seem to 
have been too far removed from the 
main current of psychological investi-
gations to be widely known today.


The term demonstration is used 
advisedly since these studies had so 
few subjects—as few as two or three— 
that little or no generalization can be 
made. However, if their results are 
basically correct, it would seem that 
they would have important impli- 
cations for designing experiments on 
spatial orientation. 

The present study was designed to 
investigate more systematically the 
possibility of improvement in ac- 
curacy of perception of the postural 


¹ This report is part of a series of investi-
gations conducted jointly with the School of
Aviation Medicine and Research, under contract 
N7onr-434, Task Order I, with the Office of 
Naval Research in cooperation with the Bureau 
of Medicine and Surgery and The Tulane 
University of Louisiana. Project Designation 
Number NR140-455 of the Medical Sciences 
Division, Office of Naval Research. Project 
No. NM-001-037 of the Research Division, 
Bureau of Medicine and Surgery. Task Order 
Director: Cecil W. Mann. 

² Formerly at Tulane University.


vertical, i.e., alignment of the longi- 
tudinal axis of the seated body with 
the gravitational vertical. It was 
also designed to obtain some quanti- 
tative indication of the magnitude 
of this improvement, provided it 
exists. 

In order to separate the effects of 
making repeated judgments of the 
postural vertical from other possibly 
confounding variables it is desirable 
to hold constant or to eliminate other 
factors which are known to influence 
accuracy of perception of the postural 
vertical. At present it seems that
postural orientations in space depend 
upon cues from the visual frame of 
reference and upon cues from the 
somaesthetic complex. We can elimi- 
nate the visual frame of reference by 
both blindfolding S and having him 
make his judgments in a completely 
darkened room. We can “control”
effects of the somaesthetic complex,
though never completely, by holding
S’s head in alignment with his body, 
by always tilting S a constant amount 
in a given direction before having him 
make his judgments, and by holding 
as constant as possible the amount of 
time S is delayed in a tilted position 
before making his judgment. Per- 
haps the best way to keep the head in 
alignment with the body is by means 
of a biteboard (1, 2) or some brace 
mechanism (9) during the course of 
the experiment. 

For the present study the following 
hypotheses were proposed for testing. 
Hypothesis I was that there would be 
a systematic decrease in average error 
of judgments of the postural vertical 




as a function of repeated judgments 
(practice). Hypothesis II was that
there would be no difference between
groups tilted to the right (O-R) and
groups tilted to the left (O-L). That
is, the direction of body tilt before
taking a judgment of the postural 
vertical should not affect improve- 
ment in accuracy of perception of the 
postural vertical. 


Method


Apparatus.—A large wooden chair built for 
the ONR-Tulane Joint Research Project was 
used. This chair could only be tilted in the
lateral plane. This apparatus has been fully 
described elsewhere (8). Its movements could 
be controlled either by E or by S at the dis- 
cretion of the former. A single key, which was 
mounted to the right arm of the chair, allowed 
S to manipulate the chair. If the key was 
moved to the left, the chair moved to the left; 
if the key was moved to the right, the chair 
moved to the right; if the key was placed upright, 
the chair stopped. 

Two rooms were used. The room in which 
S sat in the chair was completely darkened. 
The room in which E recorded judgments and 
directed the movements of the chair was con- 
nected to the apparatus in S’s room by means of 
electric cables. These cables connected selsyns
which were mounted with the tilt chair to slave
selsyns in E’s room which indicated on a dial
how many degrees S was off true vertical when 
he judged that he was upright. 

Procedure.—Each S was seated in the tilt
chair and was given the following instructions. 

“This is an experiment to see how well you 
can perceive the postural vertical, that is, how 
well you can tell when your body is upright with 
respect to the outside world. To find out how
well you can place your body in a vertical 
position, the following procedure will be used. 
I will place you in the position that you now 
occupy, that is, upright. I will say ‘Upright’ 
to let you know that you are in that position. 
Then I will move the chair in which you are 
sitting to a position of tilt. As soon as you are 
placed in that position of tilt, I will say ‘Return 
yourself.’ You are to immediately bring your- 
self back up, by means of this key, to a position 
where you feel that your body is seated verti- 
cally. If you move this key to the right, you 
will move to the right; if you move it to the left, 
you will move to the left; if you place it upright, 
the chair will stop. As soon as you feel that 




you are seated upright say ‘Now.’ This will 
let me know that you have made your judgment. 
I will then bring you to vertical and the next 
trial will begin. Bite on the biteboard at all 
times. Are there any questions as to what you 
are to do?” 

All questions were answered and, when 
necessary, the instructions were repeated. No 
practice was given with the key and chair 
before the experiment began. As soon as the 
instructions were given, S was blindfolded, the 
room was darkened, and the experiment began. 
A series of 30 trials was given in sessions of 10 
trials. There was a set time interval of 5 sec. 
between trials and a rest interval of 60 sec. 
between sessions. During the rest period E 
asked S what cues S was using in making his 
judgments and recorded whatever S reported. 

A trial was as follows: E placed S in line with 
true vertical and said “Upright.” Five seconds 
later E tilted S to a position of 30° lateral tilt, 
to the right or to the left depending on which 
condition S had been assigned. The chair in 
which S sat was offset from 5° to 15° in the 
opposite direction from the final position of tilt 
before being moved to that position. The E
said “Return yourself” and S immediately re-
turned himself. The S made his judgment
and said “Now.” The E immediately offset S
and by a series of random movements returned 
S to true vertical at which time E said “Up- 
right.” Approximately 5 sec. later the next 
trial began. On each trial E recorded (a) the 
number of degrees S was off true vertical to the 
nearest half degree, and (b) the seconds required 
to make the adjustment. 

A check was made on machine error—intro- 
duced largely through lag in gears—by leveling 
the chair with a carpenter’s level and then 
reading the degrees off true vertical on E's dial 
to the nearest .1° (by interpolation). The 
average machine error was .38°, which meant 
that any reading less than a half degree could 
only be considered machine error. 

Subjects.—Thirty-four males were used as Ss. 
All were undergraduates at Tulane University. 
A “sense of balance” test was given each S. 
This consisted of having S stand on one leg, with 
his eyes closed, for 30 sec. If S did this two 
times out of three without losing his balance, he 
was accepted as an S. Only one S failed to meet
this balance criterion. Another S, however, 
had to be removed from the chair after com- 
pleting only 12 trials when he complained of 
feeling ill and refused to go on. The Ss were 
randomly assigned to the O-R and O-L groups
as they arrived at the experimental scene. In
this manner 18 Ss were assigned to the O-R
group and 16 to the O-L group.







Results


Since no theoretical function had 
been hypothesized for the reduction 
in error of perception of the postural 
vertical with practice, a distribution- 
free analysis of practice effects was 
carried out. Each S’s score was the 
total number of degrees he was off 
true vertical in blocks of five trials. 
Thus, each S had six consecutive 
scores. A Friedman χ²ᵣ test (10)
was carried out for the O-R and O-L
groups separately. For the O-R
group this χ²ᵣ for the between-trials
effect was 22.36, for 5 df, which was
significant at the .001 level. For the
O-L group, the between-trials χ²ᵣ was
34.99, df = 5, which was also signifi-
cant at the .001 level. No significant
differences were found between the
O-R and the O-L groups on any trial
or over all trials. Pooling both
groups together, the χ²ᵣ for the be-
tween-trials effect was 41.55, df = 5,
which was significant at the .001 level.
Thus we can conclude that there is a
significant difference between suc-
cessive blocks of five trials with
respect to the accuracy of S’s judg-
ments of the postural vertical. Table
1 summarizes the average degrees
error in perception of the postural
vertical per trial in blocks of five trials.
This table shows that there is a fairly 
systematic reduction in the error of 
perception of the postural vertical 
with practice. Thus, we can conclude 


TABLE 1

Mean Degrees Error in Perception of the Postural Vertical
per Trial in Blocks of Five Trials

                        Trial Blocks
Group    N     1-5    6-10   11-15   16-20   21-25   26-30
O-L     16    3.00    1.85    1.51    1.19    1.34    1.03
O-R     18    2.49    1.87    1.60    1.21    1.32    1.42


Fig. 1. Changes in median average error
(E) of adjustment to the postural vertical as a
function of trials practice (T) for O-L and O-R
groups.


that Hypothesis I has been sub-
stantiated.
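The distribution-free analysis described above can be illustrated with a short sketch. It is not taken from the study: the function names (`block_scores`, `average_ranks`, `friedman_chi_square`) and the toy data are assumptions made for illustration, and the statistic is the textbook Friedman formula without a tie correction, which may differ in detail from the rapid procedure cited as (10).

```python
def block_scores(errors, block_size=5):
    """Total the degrees off vertical over consecutive blocks of trials."""
    return [sum(errors[i:i + block_size])
            for i in range(0, len(errors), block_size)]


def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid_rank = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            ranks[order[t]] = mid_rank
        i = j + 1
    return ranks


def friedman_chi_square(scores):
    """Friedman statistic for n subjects (rows) by k blocks (columns)."""
    n = len(scores)
    k = len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 * sum_sq / (n * k * (k + 1)) - 3.0 * n * (k + 1)


# Hypothetical data: three subjects whose block totals fall steadily.
toy = [[9.0, 6.0, 3.0], [8.0, 5.0, 4.0], [7.5, 7.0, 2.0]]
chi_r = friedman_chi_square(toy)
```

With six blocks the statistic is referred to the chi-square distribution with 5 df, for which the .001 critical value is about 20.52.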

In order to ascertain which func-
tional relationship best described the
data, the “reduction process” out-
lined by Lewis (7) was used. Test
plots were made for parabolic, hyper-
bolic, exponential, and logarithmic
curves, and the hyperbolic function
was found to best fit the data. That
is, the function Y = AX^{-B} yielded
a minimum residual variance where
Y is the median degrees Ss were off
true vertical and X is the trial num-
ber. (This was the same function
found to best fit data similarly col-
lected by Mann, Solley, and
Corrigan.*) For Group O-R the fitted
curve was Y = 3.16X^{-[exponent illegible]}. For
Group O-L the fitted curve was
Y = 4.57X^{-[exponent illegible]}. These fitted curves
and the empirical points are shown in
Fig. 1. As can be seen, the empirical
points deviate very little from the
fitted curves, indicating a fairly good
fit.
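A curve of this kind, a constant multiplied by the trial number raised to a negative power, can be fitted numerically, as in the sketch below. The study used Lewis's graphical reduction process; the sketch swaps in ordinary least squares on the logarithms of X and Y. The function name `fit_power_curve`, the symbols A and B for the constant and the exponent, and the synthetic data are all assumptions made for illustration.

```python
import math


def fit_power_curve(trials, errors):
    """Fit Y = A * X**(-B) by least squares on log(Y) = log(A) - B * log(X).

    Returns (A, B). All X and Y values must be positive.
    """
    log_x = [math.log(x) for x in trials]
    log_y = [math.log(y) for y in errors]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    slope = sxy / sxx                      # slope of the log-log line is -B
    a = math.exp(mean_y - slope * mean_x)  # intercept of the log-log line is log(A)
    return a, -slope


# Synthetic check: noise-free points generated from A = 3.0, B = 0.5.
trials = list(range(1, 31))
errors = [3.0 * x ** -0.5 for x in trials]
a_hat, b_hat = fit_power_curve(trials, errors)
```

A fit made in logarithmic units minimizes residuals in log space, so it need not coincide exactly with a fit chosen for minimum residual variance in degrees.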

The adjustment-time scores also 
showed a slight decrease with prac-
tice, i.e., Ss were making their judg-
ments slightly faster at the end of 30


*C. W. Mann, C. M. Solley, & R. F. Corrigan.
Unpublished study. 







trials than they were initially. How- 
ever, the Friedman test (10) showed 
no significant trend in the adjustment- 
time scores. Also, there was no 
significant difference between the O-R 
and the O-L groups with respect to 
adjustment-time scores. 

During the 1-min. rest period given
at the end of every 10 trials E had 
asked S to report what kind of cues 
he was using to estimate when he was 
seated upright. Two broad cate- 
gories of cues were reported. The Ss 
reported that they used “leg and arm 
balance,” “shift of weight on but-
tocks,” or “don’t know.” The Ss
who reported use of “arm and leg
balance” initially but who shifted to
use of “shift of weight on buttocks” 
invariably showed improvement. 
The Ss who continued reporting use 
of “leg and arm balance” throughout 
the 30 trials showed no improvement. 
The Ss who reported using “shift of
weight on buttocks” throughout the
30 trials did better than Ss reporting
use of “leg and arm balance” through-
out but showed only a slight improve-
ment. No S reported use of “shift
of weight on buttocks” at first with
subsequent shift to use of “leg and
arm balance.”


Discussion 


The results showed that improvement 
in accuracy of perception of the postural 
vertical did occur with practice, in spite 
of the fact that Ss (a) were not told the 
magnitude or direction of their errors, 
(b) were misled as much as possible as to
time required to get to the final position 
of tilt, and (c) were returned to true 
vertical through a series of random 
movements. This reduction in error 
occurred with respect to the number of 
degrees Ss were off true vertical when 
they perceived themselves upright. The 
phrase “reduction of error” rather than
“learning” is used deliberately since E
cannot think of any kind of “reinforce-
ment,” “reward,” or “knowledge of
results,” which the term “learning”
might imply.

Though we still do not know why Ss
shift cues, there is some evidence (verbal
reports) that they do and that there is
some kind of relationship between shift-
ing of orientation cues and reduction in
error of perception of the postural
vertical. The shift from using “leg and
arm balance” to “shift of weight on
buttocks” seems to indicate that Ss felt
that the latter was a more stable source
of cues. This reporting of cues, however,
should not be interpreted as meaning
that only these two sources of cues were
used, since it is extremely difficult to
get Ss to report their use of cues. It is
quite possible that other kinds of cues
were utilized but were not reported.

On the basis of these results it seems
that use of so-called “random orders” of
presentation of conditions in kindred
perceptual experiments should be looked
upon with suspicion. Any “random”
order that we can generate is a short,
finite sequence and definite biases are
generated. It would appear that a
systematic order of presentation would
be better since one could then analyze
the variance attributable to the per-
ceptual variables independent of learning
biases. By either keeping experimental
conditions independent by using different
Ss in each condition (which is often time-
consuming and expensive as a procedure)
or using a systematic order of presen-
tation of conditions which is counter-
balanced, one can largely remove the
danger of having effects of his perceptual
variables confounded with learning
effects.


SUMMARY 


It was hypothesized that Ss improve with 
practice in their accuracy of perception of the 
postural vertical. Two groups were used, Ss in 
one group being tilted 30° laterally to the left 
and Ss in a second group being tilted 30° 
laterally to the right. Each S had to return 
himself to the point where he perceived himself 
as aligned with true vertical on each trial. 
Thirty such trials were given each S. A de- 
crease in (a) average number of degrees S was 
off true vertical and (b) time required to make
adjustments was found, though only the former
was statistically significant.


REFERENCES 


1. Bourdon, B. La perception de la verticalité
de la tête et du corps. Rev. Phil., 1904,
57, 462-492.

2. Feilchenfeld, H. Zur Lageschätzung bei
seitlichen Kopfneigungen. Z. Psychol.,
1903, 31, 127-150.

3. Fischer, W. Das Erinnerungsvermögen
an bestimmte Lagen im Raume und seine
weitere Ausbildung durch Uebung. Z.
Biol., 1922-23, 77, 1-10.

4. Garten, S. Über die Grundlagen unserer
Orientierung im Raume. Abh. sächs. Ges.
(Akad.) Wiss., 1920, 26, 433-510.

5. Kleinknecht, F. Ein weiterer Beitrag zur
Frage des Übungseinflusses und der
Übungsfestigkeit am Neigungsstuhl. Z.
Biol., 1922, 77, 11-28.

6. Kleinknecht, F., & Lueg, W. Weitere
Untersuchungen über Lagen-Gedächtnis
und Empfindung am Neigungsstuhl. Z.
Biol., 1924, 81, 22-36.

7. Lewis, D. Quantitative methods in psy-
chology. Iowa City, Iowa: The Bookshop,
1948.

8. Mann, C. W., Berthelot-Berry, N. H.,
& Dauterive, H. J., Jr. The perception
of the vertical: I. Visual and non-
labyrinthine cues. J. exp. Psychol., 1949,
39, 538-547.

9. Wapner, S., & Werner, H. Experiments
on sensory-tonic field theory of per-
ception: V. Effect of body status on the
kinaesthetic perception of verticality. J.
exp. Psychol., 1952, 44, 126-131.

10. Wilcoxon, F. Some rapid approximate
statistical procedures. Stamford, Conn.:
American Cyanamid Co., 1949.


(Received October 26, 1955) 





Journal of Experimental Psychology 
Vol. 52, No. 5, 1956 


LISTENING TO OVERLAPPING CALLS¹


E. C. POULTON 
Applied Psychology Research Unit, Cambridge, England 


It was aimed to determine the effect 
of two competing messages, which 
were presented simultaneously from 
different loud-speakers. Effects of 
the following independent variables 
were explored: (a) whether the com- 
peting messages started simultane- 
ously, or whether one started before 
the other; (b) whether a response had 
to be made to both messages, or only 
to one of them; (c) the amount of 
additional work which S had to 
undertake; (d) the over-all frequency 
of communication from the loud-
speakers; (e) the number of loud-
speakers which presented messages
only occasionally; and (f) the arrange-
ment of the loud-speakers. Results
were negative on this latter score, and
more suitable experiments were car-
ried out subsequently to investigate
the effect (2).


Method


Experiment without conversations.—Two 
readers recorded calls on a magnetic-tape re- 
corder (Ferrograph) with two sound tracks. 
The calls involved one of four control towers, 
and were of the form: “Lakenheath tower, 1 2 3,
over.” During a period of 20 min. each control
tower received (see Table 1): (a) one pair of
synchronous calls; (b) one pair of overlapping
calls, in which the aircraft number of the first
call (first overlap) synchronized with the tower
name of the second call (second overlap); (c) two
calls each of which synchronized with a call to
another tower; (d) two calls each of which over-
lapped a call to another tower; one call was the
first of a pair, the other was the second; and (e)


¹ This research was under the general di-
rection of Professor Sir Frederic Bartlett, F.R.S.,
and Dr. N. H. Mackworth. It was carried out
in close cooperation with Mr. D. E. Broadbent.
The Ss were supplied by the Royal Navy.
Financial support from the British Medical
Research Council is also gratefully ac-
knowledged.


two or 32 separated calls; two towers received 
the small number, the other two towers received 
the large number. The quiet interval between 
calls or pairs of calls was varied between 1.0 
and 30 sec. 

The recording was presented through two or 
four loud-speakers. Two synchronous or over- 
lapping calls were always presented from dif- 
ferent loud-speakers by different voices. The 
S was instructed to listen for the calls to one 
particular control tower, and to neglect the calls 
to the three other towers. When he heard a 
relevant call, he had to write down the aircraft 
number, which consisted of three digits, and the 
time of the call in minutes and seconds to the 
nearest 5.0 sec. 

In two experimental periods there were two 
loud-speakers. In one period they were placed
one on top of the other in front of S. In the 
second period they were placed front right and 
front left of S, with a horizontal angular separa- 
tion of approximately 90°. In a third period 
there were four loud-speakers, but calls could 
only come from two at a time. The particular 
combinations varied, and could not be deter- 
mined by S in advance. Two loud-speakers 
were placed front right and front left of S, as in 
the second period; the other two were placed 
intermediately, so that the horizontal angular 
separation of each loud-speaker from the next 
was approximately 30°. The mean intensity of
calls was about 60 db when they reached S. 

The 12 Ss were tested in groups of four. 
They received a period of instruction and 
practice on one day, which lasted about 2 hr. 
In this period they worked with calls similar to 
those used in the experiment. They then re- 
ceived three experimental periods, each of which 
lasted 1 hr., on the following days. The effects 
of practice were equated by the use of Latin
squares and balanced groups. 
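One way to equate practice effects across three groups and three experimental periods is a cyclic Latin square, sketched below. The source does not give the actual assignment, so the arrangement labels, the group names, and the function `cyclic_latin_square` are hypothetical and serve only to show the principle that each arrangement occurs once in every period and once for every group.

```python
def cyclic_latin_square(conditions):
    """Build a k x k Latin square by rotating the condition list one step per row."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)]
            for row in range(k)]


# Hypothetical labels for the three loud-speaker arrangements.
arrangements = ["stacked", "separated_90_deg", "four_speakers"]
square = cyclic_latin_square(arrangements)

# Hypothetical groups of four Ss; each group follows one row of the square,
# reading left to right as first, second, and third experimental period.
groups = ["group_1", "group_2", "group_3"]
schedule = dict(zip(groups, square))

for group, order in schedule.items():
    print(group, order)
```

A single square balances the position of each arrangement but not every possible sequence of arrangements; balancing sequences as well would need additional squares.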

Experiment with conversations.—In many 
respects this was similar to the previous experi- 
ment. Calls to the four control towers from 
visiting aircraft were recorded intermittently by 
one reader on the “idle” sound track of the tape 
recorder. On the “busy” track, the same reader 
and a different reader recorded conversations, in 
addition to calls from visiting aircraft. Com- 
munication was continuous on this track. The 
conversations were read at a mean rate of three 
per minute. They involved a control tower and 
three aircraft in its vicinity. They were modeled 




upon procedures used in the Royal Air Force. 
For example: “Lakenheath tower, 2 1 0, taxi 
clearance, over.” “2 1 0 Lakenheath tower, 
clear to taxi, runway 25, Queenie Nan How 1003
millibars, 30 feet, over.” “2 1 0 out.” Each
conversation contained the tower name twice on 
the average. It gave one piece of information 
about the movement of one of the three aircraft, 
which was always repeated back. In the first 
and third quarters of an experimental period, 
the conversations all involved one tower. In 
the second and fourth quarters they involved 
another tower. The calls to the same tower 
and to other towers from visiting aircraft, were 
recorded on the busy sound track between the 
conversations. 

The calls from visiting aircraft on the two 
sound tracks were related in time, as in the 
previous experiment. In each quarter of an 
experimental period, each of the four towers 
received (see Table 2): (a) one call on the busy 
sound track, which synchronized with a call to
another tower on the idle track; (b) one call on 
the idle track, which synchronized with a call 
to another tower on the busy track; (c) two calls 
on the busy track, each of which overlapped a 
call to another tower on the idle track; one call 
was the first of a pair (first overlap), the other 
was the second of a pair (second overlap); (d) 
two calls on the idle track, each of which over- 
lapped a call to another tower on the busy 
track; and (e) two separated calls, one on the
busy track, the other on the idle track. Sepa- 
rated calls on the idle track synchronized with 
conversation on the busy track. Messages on 
the same track never overlapped. 

The recording on the busy sound track was 
presented through a single busy loud-speaker, 
while the recording on the idle track was pre- 
sented through one or three idle loud-speakers. 
In each of four experimental periods S had a 
different task to perform, which corresponded to 
the control tower he was allocated (see Table 3). 
In one period he had to record visitors’ calls to 
his tower, which were presented only from the 
idle loud-speaker(s); he could neglect the busy 
speaker completely. In another period he had 
to record visitors’ calls which were divided 
equally between the busy and idle speaker(s). 
The task in a third period was identical, except 
that the conversations in two quarters involved 
his tower. In these quarters he had to record 
the number and time of each of the three aircraft 
holding the conversations, on the first occasion 
upon which it called the tower, in addition to 
noting the calls from visiting aircraft. The task 
in the fourth period was again identical, but was 
combined with listening to the conversations to 
his tower. To ensure that he did listen, he was 
provided with a board upon which were repre- 


sented the airdrome circuit and approaches, and 
clear celluloid discs for aircraft. He had to 
number the discs to correspond to the three 
aircraft, and to move them round the board 
according to the conversations, as a controller 
does in order to assist his memory. However, 
he was told that his principal task was to record 
the calls from visitors. 

There were four arrangements of loud- 
speakers, which corresponded to the three 
arrangements of the previous experiment. In 
one experimental period the busy speaker was 
placed directly on top of or underneath a single 
idle speaker. In a second period these two 
speakers had a horizontal angular separation of 
90°. In the two remaining periods there were 
three idle speakers and one busy speaker. The 
busy speaker occupied one of the two central 
positions in one of these periods, and one of the 
two peripheral positions in the other period. 
The four experimental periods each lasted 1 hr. 
There were 16 new Ss. 

Subjects. —The 28 Ss were all enlisted men in 
the British Royal Navy, aged between 18 and 24 
yr. The groups performing the two experiments 
were of approximately similar composition. 

Scoring and calculations.—In the experiment 
with conversations, errors on aircraft numbers 
were classified as omissions or wrong numbers.
There were too few numbers recorded at times
when no call was made to the tower to allow
statistical treatment (they comprised only 7%
of all the errors in this experiment). They were
therefore neglected. The accuracy with which 
S followed the conversations on the board was 
checked by E at five fixed times during each of 
the two relevant quarters. One point was given 
for each of the three aircraft discs which was 
correctly positioned at these times. 

In the experiment without conversations, 
errors on aircraft numbers were not subdivided. 
For there were over six times as many wrong 
numbers as omissions, and most of the omissions 
were cases in which S heard the number so 
indistinctly that there was little chance of a 
guess being correct. One third of the omissions 
were in fact correctly recorded as such, although 
specific instructions were not given to do this. 
And the distribution of this third differed in no 
way from the distribution of the remainder of 
the omissions. The statistical methods of determining
significance were the same as those
described in the previous paper (2). Two-tailed
tests have always been used.


RESULTS 


TABLE 1

Effects of Overlap in Experiment Without Conversations

                      Errors per 100 Calls when
                      Overlapping Calls to:
Type of Overlap       Same Tower*    Different Towers†
Synchronous              44.5              8.8
First overlap            13.9              3.7
Second overlap           12.0
Separated

* Combined First and Second overlap different from Synchronous at .01 level, and from Separated at .001 level.
† Synchronous different from combined First and Second overlap at .05 level, and from Separated at .001 level.
Same Tower - Different Towers P = .001.
Combined First and Second Overlap, Same Tower - Different Towers P = .01.

Table 1 shows the effects of overlap
in the experiment without conversations.
The S hardly ever failed to
realize when his tower was called.
Over half the errors were confusions 
between the digits of two aircraft 
numbers which occurred simultane- 
ously or in close succession. When 
both calls were to the same tower, 
there were rather more errors on both 
calls together than would be predicted 
from the number of errors on either 
one of the calls alone. However, the 
trend was not significant (P > .05). 
Table 2 shows the effects of overlap 
in the experiment with conversations, 
when the task performed by S was
comparable to that of the previous
experiment (all speakers in use to be 
monitored for calls, conversations with 
another tower). In the present ex- 
periment pairs of calls were always to 
different towers. Wrong numbers 
and omissions were distributed inde- 
pendently. Only 3% of the omissions 
were correctly recorded as such. 
About half the wrong numbers were 
confusions between the digits of two 
aircraft numbers. 

The effects of the task to be per- 
formed are shown in Table 3. When 
the conversations from the busy 
speaker had to be followed, separated 
calls accounted for over half the 
omissions from the idle speaker(s). 
An average of 90% of the conversa- 
tions were followed correctly. 

In neither experiment did variation 
in the number of idle speakers, nor 
in the arrangement of the speakers, 
give any significant differences. 
There was no significant difference 
between readers, and practice had no 
effect upon the results. In the ex- 
periment without conversations, the 
number of separated calls to be noted, 
and the time interval between calls 
or pairs of calls, also did not affect 
the results. 


TABLE 2

Effects of Overlap in Experiment with Conversations

                     Wrong Numbers per 100 Calls        Omissions per 100 Calls
Type of overlap      Busy Speaker*   Idle Speaker(s)†   Busy Speaker‡   Idle Speaker(s)†
Synchronous              10.9
First overlap            29.7
Second overlap            8
Separated                 1.6

Note. Pairs of calls were always to different towers. Except for the conversations (which did not involve the tower called), conditions were comparable to those of Table 1.
* First overlap different from both Second overlap and Separated at .01 level. Synchronous different from both Second overlap and Separated at .05 level.
† None of the differences between overlaps with the Idle Speaker(s) is significant (P > .05).
‡ Separated different from both Synchronous and Second overlap at .02 level.
Busy Speaker - Idle Speaker(s) P = .02 or better.
Wrong Numbers - Omissions P = .01.





TABLE 3

Effects of Task in Experiment with Conversations

                                        Wrong Numbers per 100 Calls      Omissions per 100 Calls
Task                                    Busy Speaker   Idle Speaker(s)   Busy Speaker   Idle Speaker(s)*
Idle loud-speaker(s) only to be
  monitored                                                 3.1                              1.2
Busy and idle loud-speakers to be
  monitored:
  Conversations with another tower         10.7             4.5                              1.2†
  Conversations with the same tower,
  to be:
    Neglected                               7.0             5.5                              8
    Followed                               10.9             7.8                             10.9

* Neglected - Followed P = .05. None of the other differences within columns is significant (P > .05).
† Wrong Numbers - Omissions P = .05.


Discussion 


Calls from idle loud-speaker(s) captured 
attention from busy loud-speaker.—If the
presence or absence of conversations to 
another tower had made no difference, 
the proportions of wrong numbers in 
Table 2 for the busy loud-speaker would 
have corresponded to the proportions 
for the idle loud-speaker(s), and also to 
the proportions of errors in Table 1 on 
calls to different towers. In addition, 
there would have been no great propor- 
tion of omissions in Table 2 in any 
condition, since there were practically 
no genuine omissions in the experiment 
without conversations. The results dis- 
prove this null hypothesis. 

On first-overlap calls the busy speaker 
produced significantly more wrong num- 
bers than the idle loud-speaker(s), and 
also significantly more omissions (Table 
2). The proportion of wrong numbers 
from the idle loud-speaker(s) is about
the same as the proportion of errors for
the different-towers condition in Table
1. The great increase in wrong numbers
from the busy loud-speaker, and the
smaller increase in omissions, suggest
that S often failed to take note of the
aircraft number as it was called out.
This was because it synchronized with
the start of a call from an idle loud-
speaker, which tended to draw his
attention away from the busy speaker.
About 20% of the omissions were in fact
correctly recorded as such, although
specific instructions were not given to do 
this; they comprised all except one of the 
omissions which were correctly recorded 
in this experiment. The S knew he had 
been called, for the name of his tower 
was presented from the busy loud- 
speaker before the distraction occurred 
from the idle loud-speaker. 

On synchronous and second-overlap
calls, the busy loud-speaker produced 
significantly more omissions than the 
idle loud-speaker(s). With both types 
of call the proportions of wrong numbers 
were about the same for the busy and 
idle loud-speakers. These proportions 
were also about the same as the cor- 
responding proportions of errors for the 
different-towers condition in Table 1.
The increase in omitted calls from the 
busy loud-speaker, without any change 
in wrong numbers, suggests that in these 
cases S often failed to notice that the 
call was for him. This was because the 
name of his tower was presented from 
the busy loud-speaker either just after a 
different tower had been called by an 
idle loud-speaker, or at the same time. 
The irrelevant call tended to draw his 
attention away from the busy loud- 
speaker, either just before or as his tower 
name was presented. 

The sizes of the increases in errors on 
synchronous and overlapping calls from 
the busy loud-speaker, indicate the great
forcefulness of the demands for attention
which were exerted by the calls from the 
idle loud-speaker(s). Yet these calls 
from the idle loud-speaker(s) were not 
to S’s tower. Nowhere near such large 
proportions of errors were found under 
the condition of one busy and one idle 
loud-speaker in the previous paper (2, 
Table 1). For in this previous condition 
calls from the two loud-speakers never 
overlapped. Thus after a call from the 
idle loud-speaker, there was always time 
for attention to return to the busy loud- 
speaker before a call was presented from 
it. 

Confusion between synchronous num- 
bers.—Table 1 shows the effect of inter- 
ference, uncomplicated by the presence 
of a busy loud-speaker. Under these 
conditions there was no significant dif- 
ference between first- and second-overlap 
calls. There was also no significant 
difference between first-overlap calls to 
different towers, and separated calls. 
Yet with first-overlap calls, the aircraft 
number synchronized with the name of 
the tower in the second call. Whereas
with second-overlap calls and separated
calls the aircraft number was unmasked.
Thus a number was not very effectively 
masked by a name presented from a 
different loud-speaker. 

However, synchronous calls, in which 
two aircraft numbers synchronized, gave 
significantly more errors than over-
lapping calls. Thus numbers interfered
with numbers more effectively than 
names masked numbers. This addi- 
tional interference has been labeled 
confusion (1), to distinguish it from 
masking in the sense in which this term 
has generally been used. There was no 
confusion with first-overlap calls, because 
S was listening for an aircraft number; 
he was not listening for the name of the 
tower which synchronized with it. 
Whereas with synchronous calls S was
listening for a number, and it was 
another number which synchronized 
with it. 

Following the conversations restricted 
attention.—Table 3 shows that when the 
conversations from the busy speaker had 
to be followed, significantly more calls
from the idle loud-speaker(s) were omit-
ted than when the conversations could
be neglected. Over half the omissions 
were on separated calls, which always 
occurred in the middle of conversations. 
The remainder were on calls between 
conversations, while S was presumably
still preoccupied with the previous
conversation.

Table 3 also shows that calls from the 
busy loud-speaker, which were always 
presented between conversations, were 
not omitted any less frequently when the 
conversations had to be followed. After 
S had dealt with a conversation, a call 
from an idle loud-speaker still tended to 
take priority over a call from the busy 
speaker. 

Silent loud-speakers were neglected.— 
Performance showed no difference in 
either experiment whether the recordings 
on the two sound tracks were presented 
over two loud-speakers, or spread out 
over four. Attention was not normally 
given to a loud-speaker when it was 
silent. The arrangements of loud- 
speakers also showed no significant 
effects. However, in another experi- 
ment with two busy loud-speakers (2, 
Table 3), significant differences were 
found between the two loud-speakers 
placed one on top of the other, and the 
two loud-speakers separated hori- 
zontally. 


SUMMARY 


Two competing calls were presented simul- 
taneously from different loud-speakers. Each 
call contained a three-figure number, and S had 
to write down the number if it was preceded by 
his particular call sign. In one of the experi- 
ments all the loud-speakers were idle. In the 
other experiment one loud-speaker was busy all 
the time. On this loud-speaker the interval 
between calls was occupied by conversations, 
which resembled the calls in certain respects. 

In the experiment with conversations, it was 
found that a call from an idle loud-speaker 
tended to capture S’s attention. This occurred 
even when the call was irrelevant, and S knew 
that he was being called by the busy loud- 
speaker. When the conversations had to be 
followed, calls from idle loud-speakers tended to 
be missed.

In the experiment without conversations,
there were more errors when two numbers 
synchronized, than when a number was masked 
by a name. Over half the errors were con- 
fusions between the two numbers. There were 
more errors when the numbers from both the 
competing calls had to be recorded, than when 
only one of the numbers had to be recorded. 

Variation in the number of idle loud-speakers
was found to make no difference in either
experiment. Certain arrangements of loud-
speakers were also compared, again with
negative results.


REFERENCES 


1. Egan, J. P., Carterette, E. C., & Thwing,
E. J. Some factors affecting multi-
channel listening. J. acoust. Soc. Amer.,
1954, 26, 774-782.

2. Poulton, E. C. Two-channel listening. J.
exp. Psychol., 1953, 46, 91-96.


(Received October 24, 1955) 








Journal of Experimental Psychology 
Vol. 52, No. 5, 1956


SURPRISE AS A FACTOR IN THE VON RESTORFF EFFECT 


R. T. GREEN 


University College, London 


More recent work on the von 
Restorff effect has settled one or two 
questions but raised others. While 
ruling out the possibility of the 
relative quantity of the two sorts of 
material as a governing factor, Siegel 
(3) found the critical isolated items 
were remembered eight times as often 
as the massed. In a repetition of this 
experiment Saul and Osgood (2) 
found the same general tendency but 
the ratio of recalled isolated to massed 
material fell to three to one. This 
discrepancy is largely accounted for 
by the fact that whereas the second 
isolated item in Siegel’s series was 
recalled more often than any other, 
this same item did not stand out at 
all in the serial position curves ob- 
tained by Saul and Osgood. As these 
authors point out: “. . . only item
No. 3 [the first isolated] stands out
in our data and this might merely
reflect the fact that item No. 3 was
always the first item of a new
type . . .” (2, p. 375).

Delayed recall data in this latter 
study also appeared to make suspect 
the Gestalt theory of engram changes 
with time. Neither study attempted 
a statistical analysis to substantiate 
their claims. 

The hypothesis to be examined here 
is that it is not isolation in a temporal 
series as first suggested by von 
Restorff (4) that produces better 
recall, but the “surprise” aroused by 
being unexpectedly presented with a 
verbal item after a series of numerical 
items or vice versa. 


It is generally accepted that the 
emotional tone of an event is a crucial 
factor governing the degree of attention
paid to it and hence the likelihood of its
recall. What is needed in this context 
is some objective definition of “sur- 
prise.” There seem to be two obvious 
approaches to this problem. “Surprise” 
could be defined and measured in terms 
of autonomic responses such as the GSR. 
Alternatively “surprise” could be related 
to the foreseeability of an event according
to a prediction made on an inductive
basis. “Surprise” in this sense would be
a property of the structure of the 
temporal series and deducible from it. 
For example, in accordance with this 
definition a verbal item preceded by a 
long series of numerical items has more 
surprise value than the same item pre- 
ceded by a shorter series. And the 
first item of a new sort of material to 
appear has more surprise value than later 
equally isolated items. For the pur- 
poses of the present experiment it is this 
definition that has been adopted, al- 
though there is no reason why this 
criterion of surprise should not correlate 
closely with an autonomic criterion. As 
a corollary we may note that it is only 
the preceding items that affect the 
surprise value. Only a change can 
produce a surprise, whereas isolation is 
concerned also with the structure of 
succeeding items. 

In addition the question of whether 
the engrams obey the Gestalt laws or 
not will be examined, using a delayed 
recall after the manner of Saul and 
Osgood. 


METHOD


There are several ways of testing the “sur- 
prise” hypothesis. The one adopted in this 
study is to construct a list with two equally 
isolated items and predict that the first will be 
recalled significantly more often than the second 
even after serial position effects and other rele- 
vant variables have been balanced out. 

TABLE 1

Structure of the Four Lists

          List A                      List B
   1      GUB      20          1      581      20
   2      KEV      19          2      763      19
   3      DAC      18          3      258      18
   4      406      17          4      406      17
   5      TER      16          5      179      16
   6      WAJ      15          6      738      15
   7      SIH      14          7      SIH      14
   8      RUL      13          8      341      13
   9      VOM      12          9      269      12
  10      FIP      11         10      417      11
  11      417      10         11      FIP      10
  12      269       9         12      VOM       9
  13      341       8         13      RUL       8
  14      562       7         14      562       7
  15      738       6         15      WAJ       6
  16      179       5         16      TER       5
  17      HOF       4         17      HOF       4
  18      258       3         18      DAC       3
  19      763       2         19      KEV       2
  20      581       1         20      GUB       1
          List a ↑                    List b ↑

As far as possible the method used by Siegel
was followed except that, although the same
items were employed, their arrangement was
different and required four groups of Ss. The 
same four critical items were adopted but 
besides controlling for amount of each type of 
material, uniqueness of item was balanced 
between the lists, as were the serial position 
effects, in the following manner (Table 1). 
With Group A the isolated items were 406 and
HOF and the isolated positions 4 and 17. With
Group B these same items were embedded 
among similar items while retaining their serial 
position. Thus what were isolated items and 
positions in the first list became massed in the 
second and vice versa. 

The main hypothesis to be tested is that the 
second isolated item is recalled less often than 
the first even when their degree of isolation is 
the same and all other factors likely to influence 
recall have been controlled. This necessitates 
two more groups since there is a possibility, 
admittedly remote, that 406 and SIH are unique
in being more affected by surrounding items 
than 562 and HOF. So lists A and B were 
reversed and became the material for Groups 
a and b. 
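
The balancing described above can be checked mechanically. The following is a minimal sketch, not part of the original method: it takes the item orders from Table 1, forms Lists a and b by reversal, and finds the isolated items. The function names and the working definition of "isolated" (an item whose type differs from that of both its neighbours) are assumptions made for illustration.

```python
# Lists A and B as given in Table 1; Lists a and b are the same lists reversed.
LIST_A = ["GUB", "KEV", "DAC", "406", "TER", "WAJ", "SIH", "RUL", "VOM", "FIP",
          "417", "269", "341", "562", "738", "179", "HOF", "258", "763", "581"]
LIST_B = ["581", "763", "258", "406", "179", "738", "SIH", "341", "269", "417",
          "FIP", "VOM", "RUL", "562", "WAJ", "TER", "HOF", "DAC", "KEV", "GUB"]


def is_number(item):
    """True for a three-digit item, False for a nonsense syllable."""
    return item.isdigit()


def isolated_positions(items):
    """Return {serial position: item} for items that differ in type from both neighbours.

    Assumption of this sketch: 'isolated' means the item before and the item
    after are both of the other type. End items are never counted.
    """
    found = {}
    for i in range(1, len(items) - 1):
        kind = is_number(items[i])
        if kind != is_number(items[i - 1]) and kind != is_number(items[i + 1]):
            found[i + 1] = items[i]  # serial positions are 1-based
    return found


lists = {"A": LIST_A, "B": LIST_B, "a": LIST_A[::-1], "b": LIST_B[::-1]}
isolated = {name: isolated_positions(items) for name, items in lists.items()}
for name, found in isolated.items():
    print(name, found)
```

Under this definition Lists A and a have their isolated items in Positions 4 and 17, and Lists B and b in Positions 7 and 14, with each critical item serving once in each of the two positions of its pair.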

The Ss were adults, most of whom were attending
an evening institute.¹ The groups were
brought up to equal numbers by introducing a
few other naive adult Ss from other sources.
There were 23 Ss in each group, making a total
of 92. The appropriate material was presented
to each group after giving formal instructions as
follows:

“This is a memory experiment. I shall show
you a series of cards one at a time. There is an
item on each card. After the series is complete
you will have one and a half minutes in which to
write on the slip provided as many of the items
as you can recall. The order makes no difference.
I am interested only in the total number
of items that you recall correctly. Please
attend carefully and avoid making any comments.”

Each item was printed in India ink on an
8 × 5-in. white card in letters 2} in. high. The
series was shown only once, each item being
displayed for 5 sec. The Ss then had 1.5 min.
in which to recall the material on the slips
provided by E. Before leaving, E intimated
that he would be back in 50 min. to ask some
general questions on the material. When he
returned he in fact simply handed out new slips
of paper and again gave 1.5 min. for recall. In
no case did anyone require longer than the
stipulated time.

Statistical treatment was confined to the four
critical items. Each S was scored as follows.
Each isolated item recalled scored +1 and each
massed item -1 and the algebraic sum taken.
Thus each S could score between +2 and -2.
This, for the sake of convenience, we may call
the V score, since the von Restorff effect predicts
that this distribution will not be symmetrical
about zero but will be skewed towards the
positive values.

Secondly, the frequency of recall for each of
the four critical items was computed. This we
may call the S score since it is postulated that
the surprise effect will favor the first over the
second isolated item regardless of serial position
effects and other relevant factors already mentioned.

¹ I am much indebted to the staff and students
of Toynbee Hall for their cooperation in this
matter.
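
The V scoring rule can be written out as a short function. This is an illustrative sketch only: the function name, its arguments, and the example recall slip are hypothetical, and only the rule itself (+1 for each isolated critical item recalled, -1 for each massed critical item recalled) comes from the text.

```python
def v_score(recalled, isolated_items, massed_items):
    """Algebraic sum: +1 per isolated critical item recalled, -1 per massed one."""
    recalled = set(recalled)
    plus = sum(1 for item in isolated_items if item in recalled)
    minus = sum(1 for item in massed_items if item in recalled)
    return plus - minus


# Hypothetical recall slip from an S in Group A, where 406 and HOF were
# isolated and SIH and 562 were massed.
example = v_score(["GUB", "406", "SIH", "HOF"],
                  isolated_items=["406", "HOF"],
                  massed_items=["SIH", "562"])
print(example)  # two isolated items and one massed item recalled: 2 - 1 = 1
```

Because there are two isolated and two massed critical items in every list, the score is bounded by +2 and -2, as stated above.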


RESULTS


To present an over-all picture,
although no statistical measures are
applied to these data, two serial
position curves are plotted in Fig. 1,
one curve for the Lists A and a where
4 and 17 were the isolated positions,
and 7 and 14 the massed, and the
other for Lists B and b in which these
positions were reversed in isolation
value. Apart from Positions 4 and
7 the two curves are practically indistinguishable
and the advantage of
isolated over homogeneous conditions
in Positions 14 and 17 is clearly
minimal.

Fig. 1. Serial position curves for immediate recall. (Horizontal axis: serial position, 1 to 20; one curve for Lists A and a, one for Lists B and b.)

Table 2 sets out the V scores for
both immediate and delayed recall.
We see by inspection that the scores
are skewed in the expected direction.
To test the significance of this trend
we may use either χ² or the t test. In
order to calculate the expected frequencies,
we assume the null hypothesis
and use the observed frequencies
to obtain a best estimate of
the probabilities for each of the five
possible values of the V score. The
value of χ² thus obtained for immediate
recall, 28.74, with 2 df, is
significant beyond the .001 level.
Alternatively, a t test comparing the
actual mean score of .468 with a
theoretical mean of zero gives t = 5.78
(df = 91, P < .001). On the face of
it, then, we have found overwhelming
support for von Restorff’s position.

TABLE 2

Distribution of V Scores (N = 92)

                       V Score
Recall         +2    +1     0    -1    -2
Immediate       6    40    38     7     1
Delayed         3    29    46    14     0

Now we may consider the data
which are crucial to this experiment.
The question which has to be answered
when considering the above
result is how much of the apparent
von Restorff effect is contributed by
the first and how much by the second
isolated item. Table 3 sets out the
recall frequencies for the first and
second isolated items alongside the
scores for these identical items in the
same serial positions when surrounded
by homogeneous material.

TABLE 3

Distribution of S Scores of Four Critical Items

                 Isolated         Massed
Recall         1st    2nd      1st    2nd
Immediate       58     18       21     12
Delayed         42     11       21     12
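
The t test on the immediate V scores can be reproduced from the frequencies in Table 2. The sketch below is a check under stated assumptions rather than the author's own computation: it assumes an ordinary one-sample t test against a mean of zero, with the variance estimated on N - 1 degrees of freedom. The function name is hypothetical.

```python
import math

# Immediate-recall V score frequencies from Table 2 (N = 92).
immediate = {2: 6, 1: 40, 0: 38, -1: 7, -2: 1}


def one_sample_t(freqs, mu0=0.0):
    """One-sample t statistic computed from a frequency table {score: count}."""
    n = sum(freqs.values())
    mean = sum(score * count for score, count in freqs.items()) / n
    sum_sq = sum(count * (score - mean) ** 2 for score, count in freqs.items())
    sd = math.sqrt(sum_sq / (n - 1))  # sample standard deviation
    t = (mean - mu0) / (sd / math.sqrt(n))
    return n, mean, t


n, mean, t = one_sample_t(immediate)
print(n, round(mean, 3), round(t, 2))
```

On these assumptions the mean comes to about .47 and t to about 5.8 on 91 df, which agrees with the reported values to within rounding.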

The simplest way of bringing out 
the main point of this experiment 
statistically is to test the ratio for 
each of the four crucial positions 
separately. In Position 4 the two 
items were recalled 36 times when 
isolated, and the same two items only 
12 times when massed. A sign test 
gives P < .0004. For Position 7 the 
scores of 22 and 9 yield P < .015.
Whereas Position 14 with scores of 
11 and 7 gives P > .1, and Position 
17 with scores of 7 and 5 is obviously 
not significant. 
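
These sign-test probabilities can be checked with an exact binomial tail. The sketch below assumes a one-tailed test with a chance probability of one half, which the text does not state explicitly but which is consistent with the values reported; the function name is hypothetical.

```python
from math import comb


def sign_test_one_tailed(larger, smaller):
    """Exact P(X >= larger) for X ~ Binomial(larger + smaller, 1/2)."""
    n = larger + smaller
    tail = sum(comb(n, k) for k in range(larger, n + 1))
    return tail / 2 ** n


# (recalls when isolated, recalls when massed) for each critical position.
counts = {4: (36, 12), 7: (22, 9), 14: (11, 7), 17: (7, 5)}
p_values = {pos: sign_test_one_tailed(*pair) for pos, pair in counts.items()}
for pos, p in p_values.items():
    print(pos, round(p, 4))
```

On this assumption the tail probabilities fall below .0004 for Position 4 and below .015 for Position 7, and exceed .1 for Positions 14 and 17, in line with the figures quoted above.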


This treatment is not entirely satisfactory. 
It is one thing to show that the difference is 
significant for the first isolated items and not for 
the second, but quite another to show that these 
differences in significance are themselves sig- 
nificant. It is like showing that the mean of 
one group differs significantly from zero though 
that of another does not, while leaving un- 
touched the question of whether the two groups 
differ significantly from one another. 

To test this more rigorous hypothesis it is 
necessary to set up a 2 × 2 × 2 contingency
table including the forgetting frequencies. This
leads to χ² = 4.79, which with 1 df gives
P = .03 for a one-tail test.

Before we can accept this result, we must
justify the use of χ² with data that are possibly
not independent in the statistical sense. After 
all, any given S would contribute to all four 
totals in Table 3 if he recalled all four critical 
items, and it is quite possible that recalling any 
one of these four items affects the probability of 
the same S recalling any of the others. How- 
ever, if we test for independence we find that in 
the statistical sense these data are not con- 
taminated. 


Discussion 


There is a striking correspondence 
between Saul and Osgood’s results and 
some of the present findings. Both in 
the matter of the degree to which isolated
material is favored in recall and the
question of how much of this effect is
contributed by the second isolated item,
Siegel’s data are at variance with both
this and Saul and Osgood’s study.
Furthermore, as with Saul and Osgood’s
experiment, the delayed recall results
are in flat contradiction to the Gestalt
prediction. Instead of the isolated items
increasing their advantage over the
massed, the opposite trend occurs. With
46 positive and 8 negative scores for
immediate recall, there is contrasted 32 
positive and 14 negative scores for the 
delayed data. This result is due entirely 
to forgetting of isolated items which 
drop from 76 to 53, while the number of 
massed items recalled remains constant 
at 33. For the Gestalt hypothesis to be 
borne out the recall scores for massed 
items should not only fall, but do so more 
rapidly than the isolated scores. The 
actual trend in the opposed direction
turns out to be significant at the 8%
level on a one-tail test using a 2 × 2 × 2
contingency analysis. Another way of
testing the hypothesis is to compare the
immediate and delayed V scores of Table
2. Using only three categories (positive,
zero and negative) we obtain
χ² = 4.9, ν = 2, .08 < P < .09 (two-
tail).
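
The three-category comparison can be reproduced from Table 2 by collapsing the V scores into positive, zero and negative. The sketch below assumes the usual Pearson χ² for a 2 × 3 table, treating the immediate and delayed distributions as two rows; the function name is hypothetical. For 2 df the upper-tail probability has the closed form exp(-χ²/2), so no statistical library is needed.

```python
import math

# Positive, zero and negative V scores, collapsed from Table 2.
immediate = [6 + 40, 38, 7 + 1]   # 46, 38, 8
delayed = [3 + 29, 46, 14 + 0]    # 32, 46, 14


def chi_square(rows):
    """Pearson chi-square statistic for a rows-by-columns frequency table."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


chi2 = chi_square([immediate, delayed])
p = math.exp(-chi2 / 2)  # upper-tail probability; this form holds for 2 df only
print(round(chi2, 1), round(p, 3))
```

The statistic comes to about 4.9 and the probability lies between .08 and .09, matching the values given in the text.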

Greatest theoretical interest, however, 
attaches to the fact that it is only the 
first isolated item that benefits in recall. 

If isolation is a relevant factor in 
recall it is difficult to see why it should 
be much less effective towards the end of 
the list. There seems to be no reason 
according to von Restorff why the 
figure-ground relation should cease to 
operate under these conditions. 

On the other hand, a serial interference 
interpretation on the basis put forward 
by Gibson (1) would, on the face of it, 
appear to be able to deal with this result. 
If we assume that proactive interference 
is much stronger than retroactive inter- 
ference, then the first isolated item will 
suffer less because no similar material 
precedes it, whereas this is not the case 
for the second isolated item. 

This raises the question of whether
“surprise” as defined is operationally
different from proactive interference.
As far as the present experiment goes,
we are bound to conclude that there is
no such distinction. In order to produce 
“surprise” conditions, it was necessary 
to make the critical item the first of its 
kind in the list. In point of fact it is 
possible to resolve this dilemma and 
oppose these two concepts by making 
repetition the unexpected change. Later 
work on these lines (in preparation) has 
shown that the proactive interference 
interpretation is contradicted and the 
“surprise” hypothesis borne out. 


SUMMARY 


Evidence is produced which throws grave 
suspicion on the accepted nature of the von 
Restorff effect. It would seem that it is not 
isolation but an unexpected change that favors 
recall. What appears to be a highly significant 
result of the sort obtained by later workers in 
this area turns out on analysis to be incompatible 
with von Restorff’s position, although the
behaviorist explanation in terms of serial
interference seems to offer an alternative ex-
planation. The Gestalt theory concerning what
happens to engrams with time also is not
supported.


REFERENCES 


1. Gibson, E. J. A systematic application of
the concepts of generalization and differentiation
to verbal learning. Psychol.
Rev., 1940, 47, 196-229.

2. Saul, E. V., & Osgood, C. E. Perceptual
organization of materials as a factor influencing
ease of learning and degree of
retention. J. exp. Psychol., 1950, 40,
372-379.

3. Siegel, P. S. Structure effects within a
memory series. J. exp. Psychol., 1943,
33, 311-316.

4. von Restorff, H. Über die Wirkung von
Bereichsbildungen im Spurenfeld. Psychol.
Forsch., 1933, 18, 299-342.


(Received October 31, 1955) 

















